Remarks



The amendment on page 6 is the incorporation of matter from the last paragraph of
page 166 of the inventor's paper. This paper is incorporated by reference into the
patent application and is cited in footnote 11 on page 6.

The amendment on page 30 is the correction of an error that a person of ordinary
skill in the art would immediately recognize as an error.

Also included herewith is new evidence/arguments regarding patentability. It is given 
below. 



New evidence/arguments

The applicants intend to submit new claims that depend from claims for which the applicants
have received a Notice of Allowance (dated 18 April 2003). The applicants intend to submit
these new claims before the expiration of the 3-month time period constituting the requested
period for limited Suspension of Action under 37 CFR 1.103(c).

It is also the wish of the applicants to pursue apparatus claims in future prosecution. Also
enclosed are copies of eight publications. These copies of publications are being sent to the
USPTO as evidence of the patentability of apparatus claims/embodiments and of the support
for such apparatus claims/embodiments in the patent application. More specifically, the
Examiner stated in the last paragraph of page 4 of the Final Office Action dated 02 OCT
2002 that, should the applicants wish to pursue apparatus claims in future prosecution,
concrete examples are needed in the specification in order for "means plus function"
language to be interpreted and searched. Therefore these eight publications are being
provided as evidence in support of patentability.

Each of these publications is cited in the specification of the patent application and is
incorporated by reference into the application. More specifically, see some of the concrete
examples described on p. 24 lines 1 to 2, p. 29 lines 26 to 30 and p. 34 lines 3 to 18, of the
patent application. Each of these sections of the application (and the associated publications
in the endnotes) describes some concrete examples of technology in the patent application for
the interpretation of "means plus function" language. These sections of the application also
refer to publications in the endnotes. The publications in the endnotes are incorporated by
reference into the patent application.



The copies of eight publications enclosed herein are the publications listed below:

1) Weighing DNA for Fast Genetic Diagnosis. Science, March 27, 1998, vol. 279, pp. 2044-2045.

2) Accessing Genetic Information with High-Density DNA Arrays, Mark Chee, et al., Science,
vol. 274, Oct. 25, 1996, pp. 610-614.

3) Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide
probes, Saiki, et al., Proc Natl Acad Sci USA vol. 86, pp. 6230-6234.

4) Allele-specific enzymatic amplification of β-globin genomic DNA for diagnosis of sickle
cell anemia, Wu, et al., Proc Natl Acad Sci USA vol. 86, pp. 2757-2760.

5) Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay,
Nickerson, et al., Proc Natl Acad Sci USA vol. 87, pp. 8923-8927.

6) Padlock Probes: Circularizing Oligonucleotides for Localized DNA Detection. Science,
Sept. 30, 1994, vol. 265, pp. 2085-2088.

7) SNP attack on complex traits. Nature Genetics, Nov. 1998, vol. 20 no. 3, pp. 217-218.

8) Large-Scale Identification, Mapping, and Genotyping of Single-Nucleotide Polymorphisms
in the Human Genome. Wang, et al., Science, May 15, 1998, vol. 280, pp. 1077-1081.

More specifically, publication 1) is an example of a technology that uses mass
spectrometry, specifically MALDI-TOF, for genotyping. Such a technology is a
nonlimiting concrete example of a technology that is described in the application and
is used by apparatus versions of the invention. And this concrete example allows for
interpretation of means plus function language.

Publication 2) is an example of a technology that uses oligonucleotides for
genotyping, specifically high-density DNA arrays. Such a technology is a nonlimiting
concrete example of a technology that is described in the application and is used by
apparatus versions of the invention. A similar technology is described in publication
8), specifically genotyping chips. And each of these nonlimiting concrete examples
allows for interpretation of means plus function language.

Other examples of technologies that use oligonucleotides in various ways for
genotyping, for example using PCR or other kinds of hybridization reactions, are
described in the other publications enclosed herewith. Each of these technologies is
a nonlimiting concrete example of a technology that is described in the application
and is used by apparatus versions of the invention. And each of these nonlimiting
concrete examples allows for interpretation of means plus function language.

There are other similar publications cited (and incorporated by reference) in the 
application that are not enclosed herewith. And this discussion is not necessarily 
exhaustive. No technology cited herein is admitted to being prior art with respect to 
the invention by its mention or discussion in this submission. 

Sincerely,

Robert O. McGinnis, Agent of Record, Reg. No. 44,232






The San Diego researchers were looking
for a way to help the right peg find its hole,
and they settled on DNA. The chemical
bases that make up DNA (cytosine, guanine,
adenine, and thymine) will bind to
each other only in particular pairings: C with
G and A with T. Hence, a single strand made
up of the bases ATTTGC will bind strongly
with its complementary strand, TAAACG,
and not with any other sequence.
The researchers set out to exploit
this selectivity by attaching short
complementary strands of DNA to
the pegs and substrate to help the
devices find their correct positions.
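
The pairing rule described above (C with G, A with T) can be sketched in a few lines of Python. This is only an illustration of the selectivity argument, not anything taken from the article: the function names and the example strand GATTACA are hypothetical, and the comparison is position by position, as in the article's own example, without modelling strand direction.

```python
# Watson-Crick pairing rule: each base binds only its partner.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the strand that pairs with `strand`, base by base."""
    return "".join(PAIRS[base] for base in strand)

def binds(strand_a: str, strand_b: str) -> bool:
    """True only when every aligned position forms a C-G or A-T pair."""
    return len(strand_a) == len(strand_b) and complement(strand_a) == strand_b

if __name__ == "__main__":
    peg = "GATTACA"                 # hypothetical strand on a device
    site = complement(peg)          # strand that would be placed on the substrate
    print(peg, "pairs with", site)
    print("wrong site binds:", binds(peg, "CTAATGA"))
```

Under this simplified model a single mismatched base is enough to reject a site, which is the selectivity the researchers rely on.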

In their first experiment, the
team coated a substrate with a particular
short strand of DNA. They
then covered parts of the substrate
with a mask and exposed it to ultraviolet
light. The light chemically
altered the DNA in exposed areas
so that it could no longer bind to
complementary strands. The researchers
then coated some microbeads,
which acted as dummy devices, with strands
of DNA complementary to those on the
substrate. When a fluid carrying the coated
beads was splashed over the substrate, the
beads successfully bound only to those areas
that had not been exposed to UV light. One
drawback of the technique is that it worked
only for small devices, several hundred micrometers
across, that would flow easily and
not block other devices.

Nature's glue. DNA strands bind beads and substrate together.

In a second experiment, designed to show
that several varying kinds of "devices" could be
deposited at once, the group used masks to
deposit four different types of DNA strands
onto a substrate and then attached complementary
strands to four different fluorescent
molecules. When the labeled molecules were
splashed onto the substrate, the pattern of fluorescence
showed that they had bound only
to the appropriate regions of complementary
DNA. In a real system, this would mean that
four completely different types of devices could
be attached to many selected sites on a chip.
The researchers realize, however, that just
providing the glue is not going to be enough.
They are now looking for more active ways to
guide the devices to their correct positions.
One possibility is to add extra chemical groups
to the DNA on the devices to give them an
electric charge, then create electric fields on
the substrate to attract the charged devices
to "landing sites." The team is also investigating
other techniques, such as creating
currents in the fluid that
would sweep the tiny devices
to the right places.
An even bigger challenge
will be creating an
electrical connection between
the devices and their
host semiconductor. The
team is looking at the possibility
of putting the DNA
glue on the top of devices
and bonding them, upside
down, onto a dummy substrate.
Once all the devices
are in position, the dummy
could be flipped over and pressed down on the
real substrate. The substrate might be coated
with molten solder, which would add an electrical
bond to the mass marriage.

-Sunny Bains

Sunny Bains is a science writer based in the San
Francisco Bay area.
Weighing DNA for Fast Genetic Diagnosis

The modern doctor's little black bag, already
overflowing with high-tech diagnostic
devices, may soon have to make room for
another advance. To diagnose a disease,
judge future risks, or design a treatment,
doctors will one day want to know which
disease-related genes a patient carries. And
they will want this diagnostic verdict to be as
fast and accurate as a cholesterol or blood
chemistry test today. As Charles Cantor, director
of Boston University's Center for Advanced
Biotechnology, puts it: "You need a
detection system that can identify the gene
sequences that you are looking for with high
specificity, quickly, and in large volumes.
The best analytical tool for doing this," he
adds, "is mass spectrometry."

Borrowed from chemistry, this technol- 
ogy is a sharp departure from current meth- 
ods, which identify a gene sequence by allow- 
ing it to bind to a matching probe, either on 
a gel or a chip. Instead, a mass spectrometer 
vaporizes the DNA and accelerates the mol- 
ecules through a vacuum chamber with the 
help of an electric field. Tiny differences in 
the time it takes the DNA fragments to reach 
the detector reveal small differences in their 
mass, and hence their sequence. 
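
The link between arrival time and mass can be made concrete with the textbook idealization of a time-of-flight instrument: an ion of charge q accelerated through a voltage V gains kinetic energy qV, so its drift time over a length d is d * sqrt(m / (2qV)). The sketch below is a minimal illustration under that assumption. The 20 kV voltage, 1 m drift length, and the example fragment masses are hypothetical values chosen for illustration, not figures from the article.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge in coulombs
DALTON = 1.66053906660e-27    # atomic mass unit in kilograms

def flight_time(mass_da: float, charge: int = 1,
                accel_voltage: float = 20_000.0,
                drift_length: float = 1.0) -> float:
    """Idealized drift time in seconds for an ion of `mass_da` daltons.

    Assumes all of the energy q*V becomes kinetic energy before the
    field-free drift region (hypothetical instrument parameters).
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * E_CHARGE * accel_voltage
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return drift_length / velocity

if __name__ == "__main__":
    for mass in (7000.0, 7016.0):     # two hypothetical DNA fragments
        print(f"{mass:8.1f} Da -> {flight_time(mass) * 1e6:.3f} microseconds")
```

Because the time scales with the square root of the mass, a heavier fragment always arrives later, and a fourfold increase in mass doubles the flight time.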



The basic technique used for biomolecules
is one with an unwieldy name, matrix-assisted
laser desorption/ionization-time-of-flight
mass spectrometry, but a harmonious
acronym, MALDI-TOF. It is now a decade
old, but recent improvements have made it
a hot commodity among companies
hoping to commercialize DNA analysis.
"With today's technology, MALDI-TOF
can analyze hundreds of DNA
samples ... in a matter of a few minutes,"
says Daniel P. Little, who directs
mass-spectrometry development
at Sequenom Inc., a San Diego-based
company hoping to be generating diagnostic
products within 6 months.

The standard way to distinguish 
different variants of a gene is to chop 
the DNA into fragments, separate them 
on a gel, and apply probes labeled with 
fluorescence or radioactivity, which 
bind to fragments with a particular 
sequence and light them up. But the 
process is slow and the gels can be hard 
to interpret. Newer techniques embed 
an array of different DNA probes on a 
single chip, allowing researchers to test 
for many gene variants at once. These 
so-called DNA chips can screen DNA quickly.
But, as Cantor explains, the probes sometimes
bind to sequences they don't completely
match, which can limit the chips' accuracy.
Mass spectrometry may combine the DNA
chip's speed with exquisite accuracy. The
technique has long offered chemists a fast way
to sort small molecules that vaporize naturally
or can be coaxed into a vapor with bursts of
energy from a laser or ion beam. But vaporizing
large biomolecules while keeping them
intact once seemed impossible. A decade
ago, however, Franz Hillenkamp and colleagues
at Westfälische Wilhelms University
in Münster, Germany, found a way to do so
with proteins: Cocrystallize them with certain
small molecules, collectively called matrices.
When a nanosecond laser pulse vaporizes the
matrix, the resulting puff of material gently
lifts the ionized biomolecule as well.
DNA was a tougher problem. But in 1993,
Christopher Becker, then at SRI International
in Palo Alto, California, and now at
GeneTrace Systems in Menlo Park, California,
found a simple matrix compound,
3-hydroxypicolinic acid, that worked with
DNA sequences 20 to 25 bases long. By trial
and error, MALDI practitioners have come
up with several new matrices that work with
DNA fragments as long as 100 bases.

All in the timing. A mass spectrometer sizes up DNA by vaporizing and ionizing it, accelerating the molecules, and recording their arrival times at a detector.

The latest MALDI-TOF machines allow
the cloud of matrix molecules to dissipate
before applying an electric field. The field
accelerates the charged DNA fragments
toward a detector, and the differences in
time of flight can reveal mass differences as
small as 0.03%. If the DNA sequences from
a gene have the same length, as they do if
they have been produced by the polymerase
chain reaction, any departure from the mass
of the normal sequence reflects a mutation
that has deleted or added bases or substituted
others that have a different mass.
"The results are an absolute indicator of the
presence or absence of specific DNA sequences,"
says Sequenom's Little. MALDI-TOF
can distinguish gene variants that
differ by as little as a single base pair, and it
can also analyze microsatellites, stretches
of two-, three-, or four-nucleotide repeats
often used as markers for locating disease-causing
genes.
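
To see why a single substituted base is visible at a 0.03% mass resolution, it helps to compare the mass shift of a substitution with the mass of the whole fragment. The sketch below is a rough illustration only. The residue masses are approximate average values for nucleotides inside a DNA strand and are an assumption of this example, as are the function names and the 20-base test sequence. End groups, counter-ions, and isotope effects are ignored.

```python
# Approximate average masses (daltons) of nucleotide residues within a
# DNA strand. These are assumed, rounded values used for illustration.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

RESOLUTION = 0.0003   # the 0.03% figure quoted in the article

def fragment_mass(sequence: str) -> float:
    """Approximate mass of a single-stranded fragment (end groups ignored)."""
    return sum(RESIDUE_MASS[base] for base in sequence)

def relative_shift(reference: str, variant: str) -> float:
    """Fractional mass change of `variant` relative to `reference`."""
    ref_mass = fragment_mass(reference)
    return abs(fragment_mass(variant) - ref_mass) / ref_mass

if __name__ == "__main__":
    reference = "ACGT" * 5                  # hypothetical 20-base fragment
    variant = "ACGT" * 4 + "ACGA"           # last T replaced by A
    shift = relative_shift(reference, variant)
    print(f"relative mass shift: {shift:.5f}")
    print("resolvable at 0.03%:", shift > RESOLUTION)
```

With these assumed masses the smallest substitution shift (A versus T, about 9 Da) is still well above 0.03% of a 20-base fragment. For fragments near 100 bases the same shift approaches the quoted resolution limit.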
Besides offering unmatched precision,
MALDI-TOF is inherently fast. The DNA
forms a vapor and flies to the detector in
fractions of a second; even repeating the process
several times with the same sample to
boost the sensitivity takes as little as 2 seconds.
By preparing the samples in a grid and
having the laser scan each spot in turn, a
MALDI-TOF instrument can analyze 100
samples or more in a matter of minutes.
The combination of speed and accuracy
could give the technique a role in genome
sequencing as well as diagnosis. Standard,
Sanger-type DNA sequencing generates many
partial copies of a DNA sequence, each one
starting at one end of the sequence and ending
with a different one of the constituent bases.
To determine the original sequence, biologists
need to know the final base on each partial
copy, together with the copy's length. Doing
so now requires reading hundreds of bands
on gels. But by sending the mixture through a
mass spectrometer, biologists could quickly
read off the fragments' lengths and, from
the mass differences between successive fragments,
the final base on each one. Investigators
at both GeneTrace and Sequenom
have published sequences determined with
MALDI-TOF, the latest one, from Sequenom,
appearing in the April Nature Biotechnology.

For practical gene sequencing, however,
MALDI-TOF would have to work with DNA
fragments much longer than the current
100-base capacity. Becker reportedly has discovered
a new proprietary matrix that he
expects will extend MALDI-TOF's reach to
1000-base sequences. "If you can really do
upward of 1000 bases using this technique,
and if it is indeed faster and cheaper, then
this would be a big breakthrough for high-throughput
sequencing," says Jeffrey Polish, who
works in Mark Johnston's sequencing laboratory
at Washington University in St. Louis.

In the meantime, the technology has no
shortage of applications. Sequenom has shown,
for example, that it can discriminate among
30 of the mutations that cause cystic fibrosis
and pick up polymorphisms in the apolipoprotein
E gene, which have been linked to
familial hyperlipidemias, heart disease, and
Alzheimer's disease. GeneTrace has developed
a mass spectrometry-based system that can
analyze which genes are being expressed in
cells by identifying expressed sequence tags,
short stretches of DNA copied from the messenger
RNA made by active genes. Knowing
which genes are active in a tissue can help
pharmaceutical companies determine which
ones are good drug targets.

With MALDI-TOF instruments running
about $125,000 each (less than a standard
clinical chemistry analyzer), these systems
may also end up in large diagnostic labs. "Diagnostics
at the level of the gene is something
that we know is valuable, but is difficult, slow,
and expensive today," says David Cooper,
chief scientific officer at Nichols Institute
Reference Laboratories, a division of Quest
Diagnostics, one of the big 3 national reference
laboratories. MALDI-TOF, he says, could
be just the right medicine.

-Joseph Alper

Joseph Alper is a free-lance writer in Boulder,
Colorado.



LOS ANGELES: Refrigerator magnets are
best known for holding shopping lists and old
postcards onto refrigerator doors. But in a few
years, much more powerful magnets could be
the key to keeping food cold in so-called
magnetocaloric refrigerators, which would
be more energy efficient and less polluting
than standard models. Now a new class of
magnetocaloric materials, announced here
last week at a meeting of the American Physical
Society, could make these magnetic refrigerators
more practical and versatile.

The magnetocaloric effect works when
strong magnetic fields align quantum-mechanical
"spins" of electrons within atoms.
This transition reduces one aspect of
the randomness, or entropy, of the atoms.
But according to laws of thermodynamics,
some other aspect of randomness has to
increase in compensation, so the atoms increase
the randomness of their velocities,
vibrating and heating up. Once this heat is
carried away by a coolant such as water, the
field is removed and the effect works in reverse,
chilling the material and cooling a
refrigerator. To date, the peak performance
has been with the element gadolinium.

By adding various amounts of silicon and
germanium to gadolinium's crystal lattice,
Vitalij Pecharsky and Karl Gschneidner of
the Ames Laboratory at Iowa State University
discovered a new class of materials that
can chill two to six times further in a single
magnetic cycle, meaning that the refrigerators
could operate with weaker magnetic
fields or less material. Depending on the
germanium-to-silicon ratio, the new materials
also operate from about room temperature
all the way down to -253 degrees Celsius.
The cold end of the range would allow
magnetocaloric freezers to liquefy hydrogen
or natural gas for use in clean-burning power
plants or future automobiles.

To come up with the new compounds, the
team followed up on hints that magnetocaloric
materials containing gadolinium and
either silicon or germanium (but not both)
prefer a different range of temperatures than
gadolinium alone. "We're not trying to come
up with exotic new compounds out of the
pure blue sky," says Gschneidner. The surprise,
he says, was that the magnetocaloric effect
turned out to be far larger when both germanium
and silicon were added to the material.

"These new materials give you a lot more
flexibility in designing magnetocaloric [refrigerators],"
says Carl Zimm, a senior scientist
in magnetic refrigeration at Astronautics
Corporation of America in Madison,
Wisconsin. The team is still working on
making enough of the material to try it out in
Zimm's prototype gadolinium-based refrigerator,
which has been running for about a
year. The test should take place "within a
couple of months," says Gschneidner.

-James Glanz
D-0.4 M KCl. Tat-SF/pp140 was eluted with increasing
salt concentrations and was detected
mostly in 0.2 to 0.4 M KCl fractions. These fractions
were pooled, dialyzed against buffer D-0.1 M KCl,
and loaded onto a glutathione Sepharose (Pharmacia)
column containing GST-Tat fusion proteins. After
the column was washed with buffer D-0.4 M KCl,
Tat-SF/pp140 was eluted from the column with buffer
D containing 1.4 M KCl. The estimated overall
purification after these steps was ~3000-fold. In the
experiment shown in Fig. 3, the 0.2 to 0.4 M KCl
heparin Sepharose fraction containing Tat-SF activity
was subjected to fractionation through an Affi-Gel
10 matrix column (Bio-Rad) containing immobilized
Tat. Tat-SF activity was eluted from the column with
increasing salt concentrations. The 0.6 M KCl fraction
was analyzed as described in Fig. 3.
Accessing Genetic Information with
High-Density DNA Arrays

Mark Chee, Robert Yang, Earl Hubbell, Anthony Berno,
Xiaohua C. Huang, David Stern, Jim Winkler, David J. Lockhart,
MacDonald S. Morris, Stephen P. A. Fodor

Rapid access to genetic information is central to the revolution taking place in molecular
genetics. The simultaneous analysis of the entire human mitochondrial genome is described
here. DNA arrays containing up to 135,000 probes complementary to the 16.6-kilobase
human mitochondrial genome were generated by light-directed chemical synthesis.
A two-color labeling scheme was developed that allows simultaneous comparison
of a polymorphic target to a reference DNA or RNA. Complete hybridization patterns
were revealed in a matter of minutes. Sequence polymorphisms were detected with
single-base resolution and unprecedented efficiency. The methods described are generic
and can be used to address a variety of questions in molecular genetics including
gene expression, genetic linkage, and genetic variability.



A central theme in modern genetics is the
relation between genetic variability and phenotype.
To understand genetic variation and
its consequences on biological function, an
enormous effort in comparative sequence
analysis will need to be carried out. Conventional
nucleic acid sequencing technologies
make use of analytical separation techniques
to resolve sequence at the single nucleotide
level (1, 2). However, the effort required
increases linearly with the amount of sequence.
In contrast, biological systems read,
store, and modify genetic information by molecular
recognition (3). Because each DNA
strand carries with it the capacity to recognize
a uniquely complementary sequence through
base pairing, the process of recognition, or
hybridization, is highly parallel, as every nucleotide
in a large sequence can in principle
be queried at the same time. Thus, hybridization
can be used to efficiently analyze
large amounts of nucleotide sequence. In one
proposal, sequences are analyzed by hybridization
to a set of oligonucleotides representing
all possible subsequences (4). A second
approach, used here, is hybridization to an
array of oligonucleotide probes designed to
match specific sequences. In this way the
most informative subset of probes is used.
Implementation of these concepts relies on
recently developed combinatorial technologies
to generate any ordered array of a large
number of oligonucleotide probes (5).
The fundamentals of light-directed oligonucleotide
array synthesis have been described
(5, 6). Any probe can be synthesized
at any discrete, specified location in
the array, and any set of probes composed of
the four nucleotides can be synthesized in a
maximum of 4N cycles, where N is the
length of the longest probe in the array. For
example, the entire set of ~10^12 20-nucleotide
oligomer probes, or any desired subset,
can be synthesized in only 80 coupling cycles.
The number of different probes that
can be synthesized is limited only by the
physical size of the array and the achievable
lithographic resolution (7).
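
The arithmetic behind the 4N bound is easy to check: each added probe position needs at most one coupling cycle per nucleotide, while the number of distinct probes of length N is 4^N. The short sketch below reproduces the figures quoted above. The function names are hypothetical and used only for this illustration.

```python
def max_coupling_cycles(longest_probe_len: int) -> int:
    """Upper bound on coupling cycles: four nucleotides per probe position."""
    return 4 * longest_probe_len

def possible_probes(probe_len: int) -> int:
    """Number of distinct oligonucleotides of a given length."""
    return 4 ** probe_len

if __name__ == "__main__":
    n = 20
    print(f"{n}-mers: {possible_probes(n):,} possible probes, "
          f"at most {max_coupling_cycles(n)} coupling cycles")
```

For 20-nucleotide probes this gives 80 cycles and about 1.1 x 10^12 possible sequences, matching the text.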

An array consisting of oligonucleotides
complementary to subsequences of a target
sequence can be used to determine the identity
of a target sequence, measure its amount,
and detect differences between the target
and a reference sequence. Many different
arrays can be designed for these purposes.
One such design, termed a 4L tiled array, is
depicted in Fig. 1A. In each set of four
probes, the perfect complement will hybridize
more strongly than mismatched probes.
By this approach, a nucleic acid target of
length L can be scanned for mutations with
a tiled array containing 4L probes. For example,
to query the 16,569 base pairs (bp) of
human mitochondrial DNA (mtDNA), only
66,276 probes of the possible ~10^9 15-nucleotide
oligomers need to be used.
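
A 4L tiled array can be sketched as follows. This is a minimal illustration, not the authors' design software: the function names are hypothetical, probes are written base by base under the target (so each probe character is the complement of the target character above it), and only positions with a full-length window are tiled, whereas the count of 4L assumes flanking sequence is available for every position. The 15-base test target is the window whose complement appears as a probe in this article.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def tiled_probe_count(target_length: int) -> int:
    """Probes needed to query every position with all four substitutions."""
    return 4 * target_length

def tiled_probes(target: str, probe_len: int = 15, sub_pos: int = 7):
    """Yield (query_index, {substitution_base: probe}) for each tiled position.

    `sub_pos` is the 1-based substitution position within the probe as
    written; the probe is aligned under the target window it complements.
    """
    before = sub_pos - 1             # probe bases before the substitution
    after = probe_len - sub_pos      # probe bases after the substitution
    for query in range(before, len(target) - after):
        window = target[query - before: query + after + 1]
        perfect = [COMPLEMENT[base] for base in window]
        probes = {}
        for base in "ACGT":
            variant = perfect.copy()
            variant[before] = base   # vary only the substitution position
            probes[base] = "".join(variant)
        yield query, probes

if __name__ == "__main__":
    print(tiled_probe_count(16_569))
    for query, probes in tiled_probes("ACTGTATCCGACATC"):
        print(query, probes)
```

Exactly one of the four probes in each set is a perfect complement of the target window; the other three carry a single mismatch at the substitution position.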

The use of a tiled array of probes to read a
target sequence is illustrated in Fig. 1C. A
tiled array of 15-nucleotide oligomers varied
at position 7 from the 3' end (P^15,7) was
designed and synthesized for mt1, a cloned
sequence containing 1311 bp spanning the
control region of mtDNA (8-11). The upper
panel of Fig. 1C shows a portion of the fluorescence
image of an array hybridized with
fluorescein-labeled mt1 RNA (12). The base
sequence can be read by comparing the intensities
of the four probes within each column.
For example, the column for position 16,493
consists of the four probes, 3'-TGACATAGGCTGTAG,
3'-TGACATCGGCTGTAG,
3'-TGACATGGGCTGTAG, and 3'-TGACATTGGCTGTAG.
The probe with the
strongest signal is the probe with the A substitution
(A, 49 counts; C, 8 counts; G, 15
counts; and T, 8 counts, where the background
is 2 counts), identifying the base at
position 16,493 as U in the RNA transcript.
Continuing the process, the sequence
at each position can be read directly
from the hybridization intensities.
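
Reading one column of the array amounts to picking the brightest of the four probes and reporting the complement of its substitution base. The sketch below illustrates that step only, using the counts quoted above. It is a plain maximum-intensity call with a hypothetical function name, not the Bayesian algorithm the authors describe later, and it reports the DNA base (T), which corresponds to U in the RNA transcript.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def call_base(intensities: dict, background: float = 0.0):
    """Return (brightest probe base, inferred target base) for one column.

    `intensities` maps each substitution base to its hybridization counts.
    """
    corrected = {base: counts - background
                 for base, counts in intensities.items()}
    probe_base = max(corrected, key=corrected.get)
    return probe_base, COMPLEMENT[probe_base]

if __name__ == "__main__":
    column_16493 = {"A": 49, "C": 8, "G": 15, "T": 8}   # counts from the text
    probe_base, target_base = call_base(column_16493, background=2)
    print(f"brightest probe: {probe_base}, target base: {target_base}")
```

A simple maximum gives no measure of confidence, so it cannot flag the ambiguous regions that arise when several mismatches fall within one probe.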

The effect on the array hybridization
pattern caused by a single base change in
the target is illustrated in Fig. 1B, and the
detection of a single-base polymorphism is
shown in the lower panel of Fig. 1C. The
target was mt2, which differs from mt1 in
this region by a T-to-C transition at position
16,493. Accordingly, the probe with
the G substitution (third row) displays the
strongest signal. Because the tiled array was
designed to complement mt1, the hybridization
intensities of neighboring probes
that overlap position 16,493 are also affected
by the change in target sequence. The
hybridization signals of 15 probe sets of the
15-nucleotide oligomer tiled array are perturbed
by a single base change in the target
sequence. In the P^15,7 array, each probe
querying the eight positions to the left and
six positions to the right of the polymorphism
contains at least one mismatch to the
target. The result is a characteristic loss of
signal or a "footprint" for the probes flanking
a mutation position. Of the four probes
querying each position, the loss of signal is
greatest for the one designed to match mt1.
We denote the subset of probes with zero
mismatches to the reference sequence as P^0.
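
The size and asymmetry of the footprint follow directly from the probe geometry: a probe overlaps the changed base whenever that base falls inside the probe's window. The sketch below computes which query positions are affected. The function name is hypothetical, and the alignment convention (probe written base by base under the target, with the substitution at position `sub_pos` of the probe as written) is an assumption of this illustration.

```python
def perturbed_query_positions(variant_pos: int, probe_len: int = 15,
                              sub_pos: int = 7) -> list:
    """Query positions whose probe sets overlap a base change at `variant_pos`."""
    before = sub_pos - 1           # probe bases before the substitution position
    after = probe_len - sub_pos    # probe bases after the substitution position
    # A probe querying position p covers target positions p-before .. p+after,
    # so it overlaps the variant when variant_pos-after <= p <= variant_pos+before.
    return list(range(variant_pos - after, variant_pos + before + 1))

if __name__ == "__main__":
    positions = perturbed_query_positions(16_493)
    print(len(positions), "probe sets affected:", positions[0], "to", positions[-1])
```

For 15-nucleotide probes varied at position 7 this gives 15 affected probe sets: the query position itself, eight positions on one side, and six on the other, as stated above.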

A comparison of P^0 hybridization signals
from a target to those from a reference is
ideally obtained by hybridizing both samples
to the same array. We therefore developed
a two-color labeling and detection
scheme in which the reference is labeled
with phycoerythrin (red), and the target
with fluorescein (green) (13). By processing
the reference and target together, experimental
variability during the fragmentation,
hybridization, washing, and detection
steps is minimized or eliminated. In addition,
during cohybridization of the reference
and target, competition for binding
sites results in a slight improvement in mismatch
discrimination. Array hybridization is
highly reproducible, and comparative analysis
of data obtained from separate but identically
synthesized arrays is also effective.

The two-color approach was tested by analyzing
a 2.5-kb region of mtDNA that spans
the tRNA^Glu, cytochrome b, tRNA^Thr,
tRNA^Pro, control region, and tRNA^Phe DNA
sequences (14). A P^20,9 array (20-nucleotide
oligomer probes varied at position 9 from the
3' end) was designed to match the mt1 target
(that is, P^0 sequence = mt1). The mt1 reference
(red) and a polymorphic target sample
(green) were pooled and hybridized simultaneously
to the array. Differences between
the target and reference sequences
were identified by comparing the scaled red
and green P^0 hybridization intensities (15).
The marked decrease in target hybridization
intensity, over a span of ~20 nucleotides, is
shown for a single-base polymorphism at position
16,223 (Fig. 2A). The footprint is
enlarged when two polymorphisms occur in
close proximity (within ~20 nucleotides)
(Fig. 2B). When polymorphisms are clustered,
the size of the footprint depends on
the number of polymorphisms and their separation
(Fig. 2C).

To read polymorphisms accurately, we
developed an algorithm that addresses the
issue of multiple mismatches. The algorithm
performs base identification but also
flags regions of ambiguity caused by multiple
mismatches. These regions are easily
identified by the presence of a large footprint
(Fig. 2, B and C) or by two or more
bases identified as differing from P^0 within
the span of a single probe. Discrepancies
between base identifications and footprint
patterns are also flagged for further analysis
(for example, a P^0 footprint in which no
polymorphism is identified; such a pattern
is typical of a deletion). Thus, base identifications
are valid only for unflagged regions.
In flagged regions, the presence of
sequence differences is detected, but no attempt
is made to identify the sequence
without further analysis.

Sequence analysis was earned our on che 
2.5-kb target from 12 samples. A total of 
30,582 bp containtr\g 180 substitutioru rela- 
tive to mtl were analyzed. Ninety-eight per- 
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.TGAACTOTATOOQACAT. ; 
tgacatJ^ggctgtag 
tgacatcggetgtag 
tgftcatOgsctgtagr 

g&c&ta&gctgtaoa 
gac&taCgctgtaga 
gftcat&<;;getgtaaa 
^ptcfttatTgctgtaga 



t9acatA99ctgtag 
tgacatcmctgtag 
tgacatGggctacag 
tgacatTsgctgtag 
gacataAgctgtaga 

gacataGgctgtaga 
gacat^aTgct gt aga 



5' TOAACTGTATCCGACAT 
Fig. 1. (A) Design of a 4L tiled array. Each position in the
target sequence (uppercase letters) is queried by a set
of four probes on the chip (lowercase letters), identical
except at a single position, termed the substitution position,
which is either A, C, G, or T (blue indicates
complementarity, red a mismatch). Two sets of probes
are shown, querying adjacent positions in the target. (B)
Effect of a change in the target sequence. The probes
are the same as in (A), but the target now contains a
single-base substitution (base C, shown in green). The
probe set querying the changed base still has a perfect
match (the G probe). However, probes in adjacent sets
that overlap the altered target position now have either
one or two mismatches (red) instead of zero or one,
because they were designed to match the target shown
in (A). (C) Hybridization to a 4L tiled array and detection
of a base change in the target. The array shown was
designed to the mt1 sequence. (Top) Hybridization to
mt1. The substitution used in each row of probes is
indicated to the left of the image. The target sequence
can be read 5' to 3' from left to right as the complement
of the substitution base with the brightest signal. With
hybridization to mt2 (bottom), which differs from mt1 in
this region by a T→C transition, the G probe at position
16,493 is now a perfect match, with the other three
probes having single-base mismatches (A 5, C 3, G 37,
T 4 counts). However, at flanking positions, the probes
have either single- or double-base mismatches, because
the mt2 transition now occurs away from the
query position.
cent of the sequence was unambiguously assigned by a Bayesian
base identification algorithm (16). Of this 98%, which contained
both wild-type sequence and a high proportion of single-base
footprints such as the example shown in Fig. 2A, 29,878 out of
29,879 bp were identified correctly (17). The remaining 2% of the
sequence, which contained the multiple substitution footprints
(such as those shown in Fig. 2, B and C), was flagged for further
analysis. Of the 649 bp composing this 2%, 643 bp were located in
or immediately adjacent to footprints (18). In all, 179 out of the
180 polymorphisms were unambiguously detected: 126 out of 127
were identified correctly in the unflagged regions, and 53
polymorphisms occurring in the flagged regions were detected as
footprints. There were no unflagged false-positive base
identifications, and only one false-positive footprint. These
figures can be considered to be "worst case" estimates for the
type of array and target used. The P⁰ sequence represents a
Caucasian haplotype, and our sample set included eight African
samples having a large number of clustered differences to P⁰.
Furthermore, the variation in the hypervariable part of the
control region is much higher than for the rest of the
mitochondrial genome and for nuclear genes in general (Fig. 2
shows comparisons to African samples in this region).
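
The calling rule summarized in note 16 (normalize the four likelihoods, average the two strands, call only above 0.6) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the likelihoods themselves, which the paper obtains by variable kernel density estimation, are taken here as given inputs, and the function names and the use of "N" to mark an ambiguity are assumptions made for the example.

```python
BASES = "ACGT"


def normalize(likelihoods):
    """Scale the four per-base likelihoods so that they sum to 1."""
    total = sum(likelihoods[base] for base in BASES)
    if total <= 0:
        raise ValueError("likelihoods must have a positive sum")
    return {base: likelihoods[base] / total for base in BASES}


def call_base(forward, reverse, threshold=0.6):
    """Average the normalized likelihoods of both strands and call a base.

    Both inputs map each candidate target base to a likelihood. A base is
    called only if its averaged likelihood exceeds the threshold; otherwise
    "N" (a placeholder chosen for this sketch) marks an ambiguity.
    """
    fwd, rev = normalize(forward), normalize(reverse)
    averaged = {base: (fwd[base] + rev[base]) / 2 for base in BASES}
    best = max(BASES, key=lambda base: averaged[base])
    return best if averaged[best] > threshold else "N"
```

With this rule, a position supported strongly on both strands is called, while a position whose best averaged likelihood is 0.6 or lower is left for further analysis.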



The determination of a complete human mitochondrial DNA sequence
more than 15 years ago has had a tremendous influence on studies
of human origins and evolution and the role of mutations in
degenerative diseases (8, 10, 19). Because of the cost and
difficulty of conventional sequence analysis, most subsequent
sequencing studies have focused only on two small hypervariable
regions totaling ~600 bp (9). However, access to the entire
genome is required for a full understanding of the governing
genetics. We therefore designed a P²⁵,¹³ tiling array for the
mitochondrial genome. The array contains a total of 136,528
synthesis cells, each ~35 µm by 35 µm in size (Fig. 3). In
addition to a 4L tiling across the genome, the array contains a
set of probes representing a single-base deletion at every
position across the genome and sets of probes designed to match a
range of specific mtDNA haplotypes. Using long-range polymerase
chain reaction, we amplified the 16.6-kb mtDNA directly from
genomic DNA samples (20). Labeled RNA targets were prepared by in
vitro transcription and hybridized to the array. Genomic
hybridization patterns were imaged in less than 10 min by a
high-resolution confocal scanner (21).
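
The probe layout described above (four substitution probes plus one single-base deletion probe per queried position) can be illustrated with a short sketch. It assumes probes are written 5' to 3' and are complementary to the target; the function names and the "del" key are hypothetical, and the defaults of 25 bases with substitution position 13 follow the 25-base probes described for this array.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_set(target, index, length=25, sub_pos=13):
    """Build the five probes that query target[index] (0-based).

    The perfect-match probe is the reverse complement of the target window
    it covers; sub_pos is the 1-based substitution position in the probe.
    """
    start = index - (length - sub_pos)
    end = start + length
    if start < 0 or end > len(target):
        raise ValueError("probe window runs off the end of the target")
    perfect = reverse_complement(target[start:end])
    left, right = perfect[: sub_pos - 1], perfect[sub_pos:]
    probes = {base: left + base + right for base in "ACGT"}
    probes["del"] = left + right  # deletion probe, one base shorter
    return probes
```

For the 17-base target of Fig. 1, `probe_set("TGAACTGTATCCGACAT", 8, length=5, sub_pos=3)` queries the A at index 8; the T probe is then the perfect match, and the deletion probe is four bases long.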

The hybridization pattern of a 16.6-kb target to the
mitochondrial genome chip is shown in Fig. 3. Although there are
some regions of low intensity, most of the 25-nucleotide oligomer
array hybridized efficiently: Simply by identifying the highest
intensity in each column of four substitution probes, 99.0% of
the mt3 sequence could be read correctly (P⁰ sequence = mt3). The
array was used to successfully detect three disease-causing
mutations in a mtDNA sample from a patient with Leber's
hereditary optic neuropathy (22, 23) (Fig. 3C). In addition, we
detected a total of seven errors and new polymorphisms from
previously unsequenced regions.
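
Reading a sequence by taking the brightest probe in each column can be sketched in a few lines. This is an illustration only: the data layout (one dictionary of intensities per column, keyed by substitution base) and the function names are assumptions, and ties are resolved arbitrarily in favor of the first base in "ACGT".

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_target(columns):
    """Read the target as the complement of the brightest substitution base.

    columns is a list of dicts, one per queried position, mapping the
    substitution base of each probe (A, C, G, T) to its intensity.
    """
    called = []
    for intensities in columns:
        brightest = max("ACGT", key=lambda base: intensities[base])
        called.append(COMPLEMENT[brightest])
    return "".join(called)


def fraction_correct(read, reference):
    """Fraction of positions at which the read agrees with the reference."""
    if len(read) != len(reference):
        raise ValueError("sequences must have equal length")
    return sum(a == b for a, b in zip(read, reference)) / len(reference)
```

For a column in which the G probe is brightest, such as `{"A": 5, "C": 3, "G": 37, "T": 4}`, `read_target` reports a C in the target.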

We then hybridized 10 genomes from African individuals to the
array and unambiguously identified 505 polymorphisms. These were
polymorphisms that could be clearly read and for which a
confirmatory footprint was detected automatically. For the 10
samples, the 2.5-kb cytochrome b and control region sequences
were known (17). No false positives were detected in the ~25 kb
of sequence checked in this way. Additional clustered
polymorphisms were detected by the presence of footprints but not
read directly. A detailed analysis of the polymorphisms in these
genomes, and others, will be presented elsewhere.
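
The automatic footprint detection referred to above is specified in note 18: a five-position windowed average R of the log intensity ratio, a 50-count brightness floor, a top-10th-percentile cutoff on R, and a minimum run of five contiguous positions. A rough sketch under stated assumptions follows. The intensities are taken to be already normalized and positive, the window is truncated at the ends of the sequence, and the percentile is computed by simple ranking; none of these details, nor the function names, come from the paper.

```python
import math


def windowed_log_ratio(reference, sample, window=5):
    """R at each position: mean log10(reference/sample) over a centered window."""
    logs = [math.log10(r / s) for r, s in zip(reference, sample)]
    half = window // 2
    values = []
    for i in range(len(logs)):
        lo, hi = max(0, i - half), min(len(logs), i + half + 1)
        values.append(sum(logs[lo:hi]) / (hi - lo))
    return values


def find_footprints(reference, sample, background=0.0, min_counts=50.0,
                    min_run=5, top_fraction=0.10):
    """Return inclusive (start, end) index pairs of candidate footprints."""
    r_values = windowed_log_ratio(reference, sample)
    ranked = sorted(r_values)
    keep = max(1, round(len(ranked) * top_fraction))
    cutoff = ranked[-keep]  # smallest R still within the top fraction
    footprints, run_start = [], None
    for i, r_value in enumerate(r_values):
        bright = max(reference[i], sample[i]) - background >= min_counts
        if bright and r_value >= cutoff:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                footprints.append((run_start, i - 1))
            run_start = None
    if run_start is not None and len(r_values) - run_start >= min_run:
        footprints.append((run_start, len(r_values) - 1))
    return footprints
```

Because the cutoff is a percentile of the experiment itself, the flagged region is the core of the intensity drop rather than its full extent, and an experiment with no real differences would still have a top decile; the brightness floor and minimum run length are what limit spurious calls.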

The throughput of a conventional gel-based sequencer, with an
average read length of 400 nucleotides and 48 lanes that is run
twice a day, might be two mitochondrial genomes a day at best. In
contrast, the throughput of the nonoptimized system we describe
is five chips per hour. Thus, 50 genomes can be read by
hybridization in the time it takes to read two genomes
conventionally. Furthermore, there are significant reductions in
sample preparation requirements because the entire genome is
labeled in a single reaction, so the cost is similar to that for
a single sequencing reaction. Also, sequence reading at the level
of data analysis is automated: The sequences can be read in a
matter of minutes. No analytical separations or gel preparation
is needed, which contributes to the speed of the experiment.
Although the inability to read all possible sequences is a
weakness of the 4L tiled array, it is not a major limitation,
because in practice the small number of ambiguities can be
checked by targeted conventional sequencing. In particular,
highly repetitive sequences, such as long runs of a single base,
are presently best analyzed with conventional technology.
Finally, a clear advantage to the approach we describe is that it
is highly scalable. The cost, effort, and time required to
analyze the entire 16.6-kb mtDNA in a single experiment is
virtually identical to that required to read 2.5 kb. This
provides a clear path to further orders-of-magnitude improvements
in efficiency.
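
The throughput comparison above is simple arithmetic and can be checked with a short sketch. The gel figure is treated as an upper bound that assumes single-pass coverage with no overlap between reads; that assumption, the function names, and the inference that the comparison spans about ten hours of chip time are not stated in the text.

```python
def gel_genomes_per_day(read_length=400, lanes=48, runs_per_day=2,
                        genome_bp=16_600):
    """Upper bound on genomes per day from raw bases read (single pass)."""
    return read_length * lanes * runs_per_day / genome_bp


def hours_to_read(genomes, chips_per_hour=5.0):
    """Hours of chip processing needed at one genome per chip."""
    return genomes / chips_per_hour
```

With the stated values, the gel sequencer yields 38,400 raw bases per day, or about 2.3 genome equivalents before any allowance for overlap, which is consistent with "two genomes a day at best"; fifty chips at five per hour take ten hours.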

High-density oligonucleotide arrays 




Fig. 2. Detection of base differences in a 2.5-kb region by
comparison of scaled P⁰ hybridization intensity patterns between
a sample (green) and a reference (red) sequence. (A) Comparison
of sequence ief007 to mt1. In the region shown, there is a
single-base difference between the two sequences, located at
position 16,223 (C in mt1, T in ief007). This results in a
"footprint" spanning ~20 positions, 11 to the left and 8 to the
right of position 16,223, in which the ief007 intensities are
decreased by a factor of more than 10 on average relative to the
mt1 intensities. The predicted footprint location is indicated by
the gray bar, and the location of the polymorphism is shown by a
vertical black line within the bar. The size of a footprint
changes with probe length, and its relative position with
substitution position (not shown). (B) Comparison of sequence
ha001 to mt1. The ha001 target has four polymorphisms relative to
mt1. The P⁰ intensity pattern clearly shows two regions of
difference between the targets. Each region contains two or more
differences, because in both cases the footprints are longer than
20 positions and therefore are too extensive to be explained by a
single-base difference. The effect of competition can be seen by
comparing the mt1 intensities in the ief007 and ha001
experiments: The relative intensities of mt1 are greater in (B),
where ha001 contains mismatches but ief007 does not. (C) The
ha004 sequence has multiple differences to mt1, resulting in a
complex pattern extending over most of the region shown. Thus,
differences are clearly detected. Because hybridization
intensities are extremely sequence-dependent, each of the
mitochondrial sequences can also be identified simply by its
hybridization pattern.
Fig. 3. Human mitochondrial genome on a chip. (A) An image of the
array hybridized to 16.6 kb of mitochondrial target RNA (L
strand). The 16,569-bp map of the genome is shown, and the H
strand origin of replication (OH), located in the control region,
is indicated. (B) A portion of the hybridization pattern
magnified. In each column there are five probes: A, C, G, T, and
Δ, from top to bottom. The Δ probe has a single-base deletion
instead of a substitution and hence is 24 instead of 25 bases in
length. The scale is indicated by the bar beneath the image.
Although there is considerable sequence-dependent intensity
variation, most of the array can be read directly. The image was
collected at a resolution of ~100 pixels per probe cell. (C) The
ability of the array to detect and read single-base differences
in a 16.6-kb sample is illustrated. Two different target
sequences were hybridized in parallel to different chips. The
hybridization patterns are compared for four different positions
in the sequence. Only the P²⁵,¹³ probes are shown. The top panel
of each pair shows the hybridization of the target, which matches
the chip P⁰ sequence at these positions. The lower panel shows
the pattern generated by a sample from a patient with Leber's
hereditary optic neuropathy (LHON). Three known pathogenic
mutations, LHON3460, LHON4216, and LHON13708, are clearly
detected. For comparison, the fourth panel in the set shows a
region around position 11,778 that is identical







provide the foundation for a powerful genetic analysis
technology. The method can be used to characterize the spectrum
of sequence variation in a population and can be applied to the
analysis of many genes in parallel. In the case of human mtDNA,
we simultaneously analyzed the control region, 13 protein coding
genes, 22 tRNA genes, and 2 ribosomal RNA genes. The methods
described here can be applied to other research areas in
molecular genetics; for example, the ability to identify and
sequence polymorphisms provides a basis for genetic mapping. The
specificity of oligonucleotide hybridization and the scalability
of the method suggest the possibility of a dedicated array that
could be used to generate a high-resolution genetic map of an
entire genome in a single experiment. Likewise, the concepts and
techniques described here have been used to develop approaches
for mRNA identification and the large-scale, parallel measurement
of expression levels (24). Thus, the sequence of a gene, its
spectrum of change in the population, its chromosomal location,
and its dynamics of expression (all essential to a full
understanding of function) can be determined with high-density
probe arrays. The challenge now is to synthesize and read probe
arrays at even higher density. For example, a 2 cm by 2 cm array,
synthesized with probes occupying 1-µm synthesis sites in a 4L
tiling, could query the entire coding content of the human
genome, estimated at 100,000 genes.
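
The closing estimate can be checked with a few lines of arithmetic. The sketch below assumes square synthesis sites that fill the array with no borders or controls and exactly four probes per queried base; the function name is hypothetical, and the resulting figure of about 1 kb of coding sequence per gene is an implication of the numbers rather than a statement in the text.

```python
def tiling_capacity_bases(side_cm, site_um, probes_per_base=4):
    """Number of bases a square array can query in a 4L tiling."""
    sites_per_side = round(side_cm * 10_000 / site_um)  # 1 cm = 10,000 µm
    return sites_per_side ** 2 // probes_per_base
```

A 2 cm by 2 cm array with 1-µm sites holds 4 x 10^8 sites, enough to query 10^8 bases, or roughly 1,000 bases for each of 100,000 genes.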
for 3 hours with rotation at 60 rpm. The chip was washed six
times at room temperature with 6X SSPE (0.9 M NaCl, 60 mM
NaH2PO4, 6 mM EDTA, pH 7.4), 0.005% Triton X-100.
Phycoerythrin-conjugated streptavidin (2 µg/ml in 6X SSPE, 0.005%
Triton X-100) was added and incubation continued at room
temperature for 5 min. The chip was washed again and scanned at a
resolution of ~74 pixels per probe cell. Two scans were
collected: a fluorescein scan was obtained with a 515- to 545-nm
band-pass filter, and a phycoerythrin scan with a 560-nm
long-pass filter. Signals were separated to remove spectral
overlap and average counts per cell determined.
14. Each 2.5-kb target sequence was PCR-amplified directly from
genomic DNA with the primer pair L14675-T3
(5'-aattaaccctcactaaagggaTTCTCGCACGGACTACAAC) and H867-T7 (M).

15. To scale the sample to the reference intensities, we
constructed a histogram of the base 10 logarithm of the intensity
ratios for each pair of probes. The histogram had a mesh size of
0.01 and was smoothed by replacing the value at each point with
the average number of counts over a five-point window centered at
that point. The highest value in the histogram was located, and
the resulting intensity ratio was taken to be the most probable
calibration coefficient.

16. Base identification was accomplished with a Bayesian
classification algorithm based on variable kernel density
estimation. The likelihood of each identification associated with
a set of hybridization intensity values was computed by comparing
an unknown set of probes to a set of example cases for which the
correct base identification was known. The resulting four
likelihoods were then normalized so that they summed to 1. Data
from both strands were combined by averaging the values. If the
most likely base identification had an average normalized
likelihood greater than 0.6, it was called; otherwise the base
was called as an ambiguity. The example set was derived from two
different samples, toOIS and ier005, which have a total of 35
substitutions relative to mt1, of which 19 are shared with the 12
samples analyzed and 16 are not. Identification performance was
not sensitive to the choice of examples.

17. To provide an independently determined reference sequence,
each 2.5-kb PCR amplicon was sequenced on both strands by
primer-directed fluorescent chain-terminator cycle sequencing
with an ABI 373A DNA sequencer and assembled and manually edited
with Sequencher 3.0. The analysis presented here assumes that the
sequence amplified from genomic DNA is essentially clonal [R. J.
Monnat and L. A. Loeb, Proc. Natl. Acad. Sci. U.S.A. 82, 2895
(1985)] and that its determination by gel-based methods is
correct. A frequent length polymorphism at positions 303 to 309
was not detected by hybridization under the conditions used. It
was excluded from analysis and is not part of the set of 180
polymorphisms discussed in the text. However, polymorphisms at
this site have previously been differentiated by oligonucleotide
hybridization [M. Stoneking, D. Hedgecock, R. G. Higuchi, L.
Vigilant, H. A. Erlich, Am. J. Hum. Genet. 48, 370 (1991)].
18. The P⁰ intensity footprints were detected in the following
way: The reference and sample intensities were normalized (15),
and R, the average of log10(P⁰ reference/P⁰ sample) over a window
of five positions, centered at the base of interest, was
calculated for each position in the sequence. Footprints were
detected as regions having at least five contiguous positions
with a reference or sample intensity at least 50 counts above
background and an R value in the top 10th percentile for the
experiment. At 205 polymorphic sites, where the sample was
mismatched to P⁰, the mean R value was 1.01, with a standard
deviation of 0.57. At 35,333 nonpolymorphic sites (that is, where
both reference and sample had a perfect match to P⁰), the mean
value was -0.05, with a standard deviation of 0.26.

19. R. L. Cann, M. Stoneking, A. C. Wilson, Nature 325, 31
(1987); M. Zeviani et al., Am. J. Hum. Genet. 47, 904 (1990);
D. C. Wallace, Annu. Rev. Biochem. 61, 1175 (1992); S. Horai, K.
Hayasaka, R. Kondo, K. Tsugane, N. Takahata, Proc. Natl. Acad.
Sci. U.S.A. 92, 532 (1995); T. Hutchin and G. Cortopassi, ibid.,
p. 6892.

20. Long-range PCR amplification was carried out on genomic DNA
with Perkin-Elmer GeneAmp XL PCR reagents according to the
manufacturer's protocol. Primers were L14836-T3
(5'-aattaaccctcactaaagggaTGAAACTTCGGCTCACTCCTTGGCG) and RH1066-T7
(5'-taatacgactcactatagggaTTTCATCATGCGGAGATGTTGGATGG), based on
RH1066 [S. Cheng, R. Higuchi, M. Stoneking, Nature Genet. 7, 350
(1994)]. Each 100-µl reaction contained 0.2 µM concentration of
each primer and ~10 to 50 ng of total genomic DNA. Transcription
reactions were carried out in 10 µl with Ambion MAXIscript kit
according to the manufacturer's protocol. The concentration of
the 16.6-kb PCR template was ~2 nM, and the reaction contained
Ambion 1X biotin-14-CTP/NTP mix and 0.2 mM biotin-16-UTP.
Incubation was at 37°C for 2 hours. Fragmentation and
hybridization were as described (13), except that 3.5 M TMACl and
the biotin-labeled oligonucleotide 5'-CTGAACGGTAGCATCTTGAC were
used in the hybridization buffer, which also contained fragmented
baker's yeast RNA (100 µg/ml) (Sigma). Hybridization was carried
out at 40°C for 4 hours.
21. A custom telecentric objective lens with a numerical aperture
of 0.25 focuses 5 mW of 488-nm argon laser light to a
3-µm-diameter spot, which is scanned by a galvanometer mirror
across a 14-mm field at 30 lines per second. Fluorescence
collected by the objective is descanned by the galvanometer
mirror, filtered by a dichroic beamsplitter (555 nm) and a
band-pass filter (555 to 607 nm), focused onto a confocal
pinhole, and detected by a photomultiplier. Photomultiplier
output is digitized to 12 bits. A 4096 by 4096 pixel image is
obtained in less than 3 min. Pixel size is 3.4 µm. The data from
four sequential scans were summed to improve the signal-to-noise
ratio.
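
The calibration step of note 15 can be sketched as a histogram-mode estimate. This is a minimal illustration under stated assumptions: the ratio is taken as sample over reference (the note does not give the direction), ties in the smoothed histogram are broken by the raw bin count, intensities are assumed positive, and the function name is hypothetical.

```python
import math
from collections import Counter


def calibration_ratio(reference, sample, mesh=0.01, window=5):
    """Most probable sample/reference intensity ratio from a smoothed histogram.

    Log10 ratios are binned with the given mesh size, each bin is replaced
    by the mean count over a centered window, and the peak bin is converted
    back to a ratio.
    """
    counts = Counter(
        round(math.log10(s / r) / mesh) for r, s in zip(reference, sample)
    )
    half = window // 2

    def smoothed(bin_index):
        neighbors = range(bin_index - half, bin_index + half + 1)
        return sum(counts.get(b, 0) for b in neighbors) / window

    # Ties on the smoothed value are broken by the raw count (an assumption).
    best = max(
        range(min(counts), max(counts) + 1),
        key=lambda b: (smoothed(b), counts.get(b, 0)),
    )
    return 10 ** (best * mesh)
```

Taking the mode rather than the mean makes the estimate insensitive to the minority of probe pairs that differ because of real sequence differences, which is presumably why a histogram peak was used.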



The nucleosome has an active role in gene regulation. Mutations
of the core histones have specific consequences for the
transcription of particular genes (1). The specificity of these
effects can be explained both by the positioning of histones with
respect to DNA sequence (2) and the potential


D. Pruss and A. P. Wolffe, Laboratory of Molecular Embryology,
National Institute of Child Health and Human Development,
National Institutes of Health, Building 6,
Biochemistry, Southern Illinois University at Carbondale,
J. Hayes, Department of Biochemistry, School of Medicine and
Dentistry, University of Rochester, Rochester,
22. M. D. Brown, A. S. Voljavec, M. T. Lott, I. MacDonald, D. C.
Wallace, FASEB J. 6, 2791 (1992).

23. Mitochondrial DNA populations can contain more than one
sequence type, in a condition known as heteroplasmy. The
mutations shown in Fig. 3C were characterized as being
homoplasmic by conventional sequencing and restriction
endonuclease digestion (M. Brown, personal communication). In
controlled mixing experiments, we have shown that sequences
present at the level of 10% can easily be detected by
hybridization (M. Chee and R. Yang, unpublished results; N. Shen,
personal communication). The sensitivity of detection is sequence
dependent. Importantly, hybridization can be used to detect
heterozygous nuclear DNA sequences (J. Hacia et al., in
preparation).
targeting of histone modifications to particular nucleosomes (3).
Thus, an understanding of nucleosomal architecture is central to
understanding the transcription process.

The nucleosome contains two molecules of each of the four core
histones (H2A, H2B, H3, and H4), a single molecule of a linker
histone (H1, H1°, or H5), and ~180 base pairs (bp) of DNA (4). In
isolation, the core histones assemble into an octameric complex
(5), whose structure has been determined at 3.1 Å resolution
(6-8). The exact path of DNA on the surface of the histone
octamer, the position of the linker histone molecule within the
nucleosome, and the path of linker DNA between adjacent
nucleosomes (9-11) remain to be determined.

We used positioned nucleosomes containing the Xenopus borealis
somatic 5S ribosomal RNA (rRNA) gene to examine



An Asymmetric Model for the Nucleosome:
A Binding Site for Linker Histones Inside the
DNA Gyres

Dmitry Pruss, Blaine Bartholomew, Jim Persinger,
Jeffrey Hayes, Gina Arents, Evangelos N. Moudrianakis,
Alan P. Wolffe*

Histone-DNA contacts within a nucleosome influence the function of trans-acting factors
and the molecular machines required to activate the transcription process. The internal
architecture of a positioned nucleosome has now been probed with the use of
photoactivatable cross-linking reagents to determine the placement of histones along the DNA
molecule. A model for the nucleosome is proposed in which the winged-helix domain of
the linker histone is asymmetrically located inside the gyres of DNA that also wrap around
the core histones. This domain extends the path of the protein superhelix to one side of
the core particle.
Genetic analysis of amplified DNA with immobilized
sequence-specific oligonucleotide probes
ABSTRACT The analysis of DNA for the presence of particular
mutations or polymorphisms can be readily accomplished by
differential hybridization with sequence-specific oligonucleotide
probes. The in vitro DNA amplification technique, the polymerase
chain reaction (PCR), has facilitated the use of these probes by
greatly increasing the number of copies of target DNA in the
sample prior to hybridization. In a conventional assay with
immobilized PCR product and labeled oligonucleotide probes, each
probe requires a separate hybridization. Here we describe a
method by which one can simultaneously screen a sample for all
known allelic variants at an amplified locus. In this format, the
oligonucleotides are given homopolymer tails with terminal
deoxyribonucleotidyltransferase, spotted onto a nylon membrane,
and covalently bound by UV irradiation. Due to their long length,
the tails are preferentially bound to the nylon, leaving the
oligonucleotide probe free to hybridize. The target segment of
the DNA sample to be tested is PCR-amplified with biotinylated
primers and then hybridized to the membrane containing the
immobilized oligonucleotides under stringent conditions.
Hybridization is detected nonradioactively by binding of
streptavidin-horseradish peroxidase to the biotinylated DNA,
followed by a simple colorimetric reaction. This technique has
been applied to HLA-DQA genotyping (six types) and to the
detection of Mediterranean β-thalassemia mutations (nine
alleles).



Differential hybridization with sequence-specific oligonucleotide
probes has become a widely used technique for the detection of
genetic mutations and polymorphisms (1-5). When hybridized under
the appropriate conditions, these synthetic DNA probes (usually
15-20 bases in length) will anneal to their complementary target
sequences in the sample DNA only if they are perfectly matched.
In most cases, the destabilizing effect of a single base-pair
mismatch is sufficient to prevent the formation of a stable
probe-target duplex (6). With an appropriate selection of
oligonucleotide probes, the relevant genetic content of a DNA
sample can be completely described.

This very powerful method of DNA analysis has been greatly
simplified by the in vitro DNA-amplification technique, the
polymerase chain reaction (PCR) (7-9). The PCR can selectively
increase the number of copies of a particular DNA segment in a
sample by many orders of magnitude. As a result of this 10^- to
iCF-fold amplification, more convenient assays and nonradioactive
detection methods have become possible (10-12). These PCR-based
assays are usually done by amplifying the target segment in the
sample to be tested, fixing the amplified DNA onto a series of
nylon membranes, and hybridizing each membrane with one of the
labeled oligonucleotide probes under stringent hybridization
conditions. However, each probe must still be individually
hybridized to the amplified DNA and the process can easily become
difficult in a system where many different mutations or
polymorphisms occur.

The publication costs of this article were defrayed in part by page

One approach to address this procedural difficulty is to
"reverse" the DNAs: attach the oligonucleotides to the nylon
support and hybridize the amplified sample to the membrane. Thus,
in a single hybridization reaction, an entire series of sequences
could be analyzed simultaneously. The strategy we adopted was to
immobilize the oligonucleotides onto nylon filters by ultraviolet
fixation. Exposure to UV light activates thymine bases in DNA,
which then covalently couple to the primary amines present in
nylon (13). It seemed unlikely, however, that short
oligonucleotides could be directly attached to nylon in this
manner and still retain their ability to discriminate at the
level of a single base-pair mismatch. Consequently, the addition
of a long deoxyribothymidine homopolymer tail, poly(dT), to the
3' end of the oligonucleotide appeared promising for several
reasons. First, the poly(dT) tail would be a larger target for UV
crosslinking and should preferentially react with the nylon.
Second, dTTP is very readily incorporated onto the 3' ends of
oligonucleotides by terminal deoxyribonucleotidyltransferase and
would permit the synthesis of very long tails (14).
(Deoxyribothymidine would also be the most efficiently
incorporated base if a purely synthetic route were chosen.)
Third, Collins and Hunsaker (15) had shown that the presence of a
poly(dA) homopolymer tail, used to introduce multiple ³⁵S labels,
did not affect the function of sequence-specific oligonucleotide
probes.

We have used this technique to attach oligonucleotide probes
specific for the six major HLA-DQA DNA types (16) and the eight
most common Mediterranean β-thalassemia mutations (4) to nylon
filters. The target segment of the DNA sample to be tested
(either HLA-DQA or β-globin) was amplified by PCR with
biotin-labeled primers to introduce a nonradioactive tag.
Hybridization of the amplified product to the immobilized
oligonucleotides and binding of streptavidin-horseradish
peroxidase conjugate to the biotinylated primers were performed
simultaneously. Detection was accomplished by a simple
colorimetric reaction involving the enzymatic oxidation of a
colorless chromogen that yielded a red color wherever
hybridization occurred.

MATERIALS AND METHODS 

Tailing of Oligonucleotides. Oligonucleotides were synthesized on
a DNA synthesizer (model 8700, Biosearch) with β-cyanoethyl
N,N-diisopropylphosphoramidite nucleosides (American Bionetics,
Hayward, CA) by using protocols provided by the manufacturer.
Oligonucleotide (200 pmol) was tailed in 100 µl of 100 mM
potassium cacodylate/25 mM Tris·HCl/1 mM CoCl2/0.2 mM
dithiothreitol, pH 7.6 (17), with 5-160 nmol of
deoxyribonucleoside triphosphate (dTTP or

Abbreviation: PCR, polymerase chain reaction.
dCTP) and 60 units (50 pmol) of terminal
deoxyribonucleotidyltransferase (Ratliff Biochemicals, Los
Alamos, NM) for 60 min at 37°C. Reactions were stopped by
addition of 100 µl of 10 mM EDTA. The lengths of the homopolymer
tails were controlled by limiting dTTP or dCTP. For example, a
nominal tail length of 400 dT residues was obtained by using 80
nmol of dTTP in the above reaction.
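
The nominal tail length follows directly from the molar ratio of nucleotide to oligonucleotide. The sketch below assumes complete and uniform incorporation of the limiting nucleotide, which is what "nominal" implies; the function name is hypothetical.

```python
def nominal_tail_length(dntp_nmol, oligo_pmol=200.0):
    """Nominal residues per tail: moles of dNTP per mole of oligonucleotide."""
    return dntp_nmol * 1000.0 / oligo_pmol  # 1 nmol = 1000 pmol
```

With 200 pmol of oligonucleotide, 80 nmol of dTTP gives the stated nominal length of 400 residues, and the 5-160 nmol range corresponds to nominal tails of 25 to 800 residues.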

Preparation of Filters. The tailed oligonucleotides were diluted
into 100 µl of TE (10 mM Tris·HCl/0.1 mM EDTA, pH 8.0) and
applied to a nylon membrane (Genetrans-45; Plasco, Woburn, MA)
with a spotting manifold (Bio-Dot; Bio-Rad). The damp filters
were then placed on TE-soaked paper pads in a UV light box
(Stratalinker 1800; Stratagene) and irradiated at 254 nm. Dosage
was controlled by the device's internal metering unit. The
irradiated membranes were washed in 200 ml of 5x SSPE (1x SSPE is
180 mM NaCl/10 mM NaH2PO4/1 mM EDTA, pH 7.2) with 0.5% NaDodSO4
for 30 min at 55°C to remove unbound oligonucleotides. If not
used immediately, the filters were rinsed in water, air-dried,
and stored at room temperature until needed.

Amplification of DNA. PCR amplification of genomic sequences was
performed by a slight modification of previously described
procedures (9). DNA (0.1-0.5 µg) was amplified in 100 µl
containing 50 mM KCl, 10 mM Tris·HCl (pH 8.4), 1.5 mM MgCl2, 10
µg of gelatin, 200 µM each dATP, dCTP, dGTP, and dTTP, 0.2 µM
each biotinylated amplification primer, and 2.5 units of Thermus
aquaticus (Taq) DNA polymerase (Perkin-Elmer/Cetus). The cycling
reaction was done in a programmable heat block (DNA Thermal
Cycler, Perkin-Elmer/Cetus) set to heat at 95°C for 15 sec
(denature), cool at 55°C for 15 sec (anneal), and incubate at
72°C for 30 sec (extend) by the "Step-Cycle" program. After 30
repetitions, the samples were incubated an additional 5 min at
72°C. The primers contained a single molecule of biotin attached
to the 5' end of the oligonucleotides (described below).

Hybridization and Detection of Amplified DNA. Each filter with
bound oligonucleotides was placed in 4 ml of hybridization
solution containing 5x SSPE, 0.5% NaDodSO4, and 400 ng of
streptavidin-horseradish peroxidase conjugate (SeeQuence; Eastman
Kodak). PCR-amplified DNA (20 µl) was denatured by addition of an
equal volume of 400 mM NaOH/10 mM EDTA and added immediately to
the hybridization solution, which was then incubated at 55°C for
30 min. (During this incubation, hybridization of PCR product to
immobilized oligonucleotide and binding of
streptavidin-horseradish peroxidase to biotin present in the PCR
product occur simultaneously.) The filters were briefly rinsed
twice in 2x SSPE/0.1% NaDodSO4 at room temperature, washed once
in 2x SSPE/0.1% NaDodSO4 at 55°C for 10 min, and then briefly
rinsed twice in 2x PBS (1x PBS is 137 mM NaCl/2.7 mM KCl/8 mM
Na2HPO4/1.5 mM KH2PO4, pH 7.4) at room temperature. Color
development was performed by incubating the filters in 25-50 ml
of red leuco dye (Eastman Kodak) at room temperature for 5-10
min. Photographs were taken for permanent records.

Synthesis of Biotinylated Oligonucleotide Primers. Primary amino
groups were introduced at the 5' termini of the primers by a
variation of published procedures (18, 19). In brief,
tetraethylene glycol was converted to the monophthalimido
derivative by reaction with phthalimide in the presence of
triphenylphosphine and diisopropyl azodicarboxylate (20). The
monophthalimide was converted to the corresponding β-cyanoethyl
diisopropylamino phosphoramidite by standard protocols (21). The
resulting phthalimido amidite was added to the 5' ends of the
oligonucleotides during the final cycle of automated DNA
synthesis by using standard coupling conditions. During normal
deprotection of the DNA (concentrated aqueous ammonia for 5 hr at
55°C), the phthalimido group was converted to a primary amine,
which was subsequently acylated with an appropriate biotin active
ester. NHS-LC-biotin (Pierce) was selected for its water
solubility and lack of steric hindrance. The biotinylation was
performed on crude, deprotected oligonucleotide, and the mixture
was purified by a combination of gel filtration and
reversed-phase HPLC. Additional details of this procedure will be
published elsewhere (22).

RESULTS 

Binding and Hybridization Efficiency of Tailed Oligonucleotides.
The relative efficiencies with which synthetic oligonucleotides
with homopolymer tails of various lengths were covalently bound
to the nylon filter were measured as a function of UV exposure
(Fig. 1 Left). Oligonucleotides with longer poly(dT) tails were
more readily fixed to the membrane, and all attained their
maximum values by 240 mJ/cm² of irradiation at 254 nm. In
contrast, the (dC)400-tailed oligonucleotide required more
irradiation to crosslink to the nylon and was not comparable to
the equivalent (dT)400 construct even after 600 mJ/cm² exposure.
This difference is consistent with the findings of Church and
Gilbert (13) that suggested light-activated thymine bases bind
more effectively to nylon than do cytosine bases. The untailed
oligonucleotide was also retained by the membrane in a manner
that roughly paralleled the poly(dC) product.

Efficient binding of oligonucleotides to the membrane,
however, does not necessarily correlate with hybridization
efficiency, and so hybridization efficiency as a function of
UV dosage was determined in a separate experiment (Fig. 1
Right). These results show a distinct optimum of exposure
that changes with the length of the poly(dT) tail and is more
sharply pronounced for the longer tails. Additional experiments
have shown the optimal dosages to be about 20 mJ/cm²
for the (dT)800 and 40 mJ/cm² for the (dT)400 oligonucleotides
(R.K.S., unpublished observations). The peak efficiencies of
the (dT)400 and (dT)800 constructs are around 1% (45-50 fmol
of radiolabeled probe annealed to ≈3.5 pmol of tailed
oligonucleotide), which is similar to the value reported by Gamper
et al. (23) for an oligonucleotide probe hybridized to
nylon-bound plasmid DNA.
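
The "around 1%" figure can be checked with a line of arithmetic. The sketch below is only an illustration of that calculation; the function name is ours, and the inputs are the 45-50 fmol and ≈3.5 pmol values quoted above.

```python
def hybridization_efficiency(probe_fmol: float, bound_pmol: float) -> float:
    """Fraction of membrane-bound oligonucleotide that annealed to probe."""
    bound_fmol = bound_pmol * 1000.0  # 1 pmol = 1000 fmol
    return probe_fmol / bound_fmol


# Values quoted in the text: 45-50 fmol of probe on about 3.5 pmol of bound oligonucleotide.
low = hybridization_efficiency(45, 3.5)
high = hybridization_efficiency(50, 3.5)
print(f"{low:.1%} to {high:.1%}")
```

Both ends of the range come out slightly above 1%, consistent with the rounded value given in the text.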

Comparison of the data in Fig. 1 Left and Right for 60
mJ/cm² irradiation indicates that oligonucleotides with
longer tails hybridize more effectively than can be accounted
for by the additional amounts bound to the filter. This
suggests a spacer effect wherein the poly(dT) tails improve
hybridization efficiency by increasing the distance between
the nylon membrane and the terminal oligonucleotide probe.
Besides possible UV damage to the DNA itself, additional
exposure causes more of the tail to become attached to the
membrane, thus reducing the average spacer length and
decreasing hybridization efficiency. The markedly different
hybridization profile of the poly(dC) oligonucleotide is
compatible with this interpretation. Because cytosines react less
efficiently with the filter, hybridization efficiency reaches a
plateau where loss due to UV damage and tail shortening are
compensated by the fixing of new molecules (see Fig. 1 Left).
This characteristic of cytosine may make a poly(dC) tail
desirable when UV irradiation cannot be carefully controlled.
Under the stringent hybridization conditions used in this
experiment, no signal was detected for the untailed
oligonucleotide.

Fig. 1. Filter retention and hybridization efficiency of tailed oligonucleotides as a function of UV dosage and tail length. (Left) Filter
retention. A 19-base oligonucleotide, 19A (5'-CTCCTGAGGAGAAGTCTGC-3'), was 5'-end-labeled with 32P by using phage T4 polynucleotide
kinase and [γ-32P]ATP (10). Portions of the labeled oligonucleotide were given 3' homopolymer tails, with terminal deoxyribonucleotidyltransferase
and either dTTP or dCTP. The base compositions and lengths of the tails were as follows: (dlV (dlTu* (dT)50, (dT)100, (dT)200, (dT)400,
(dT)800, and (dC)400. Four picomoles of each oligonucleotide was spotted onto nine duplicate filters, UV irradiated for various times, and washed
to remove unbound oligonucleotides; each spot then was measured by scintillation counting to determine the amount crosslinked to the nylon.
The values plotted are relative to an unirradiated, unwashed control filter (100% retention). (Right) Hybridization efficiency. Filters containing
tailed, but unlabeled, 19A were prepared as described above and hybridized under sequence-specific conditions (see Materials and Methods)
with a 32P-labeled 40-base oligonucleotide, RS24 (5'-CCCACAGGGCAGTAACGGCAGACTTCTCCTCAGGAGTCAG-3'), complementary to
19A. The specific activity of the RS24 was 1.5 μCi/pmol (1 μCi = 37 kBq). Each spot was assayed by scintillation counting. The values plotted
are fmol of RS24 hybridized to the membrane.

DNA Typing at the HLA-DQA Locus. The HLA-DQA test is
derived from a PCR-based oligonucleotide typing system that
partitions the polymorphic variants at the DQA locus into
four major DNA types, DQA1 to DQA4, and three DQA1
subtypes, DQA1.1 to DQA1.3 (16). Four oligonucleotides
specific for the major DQA types, four oligonucleotides that
characterize the DQA1 subtypes, and one control oligonucleotide
that hybridizes to all allelic DQA sequences (Table
1) were given 400-base poly(dT) tails and spotted onto nylon
filters. The sequence variation that defines the DQA types is
localized within a relatively small "hypervariable" region of
the second exon (24) that can be encompassed within a single
242-base-pair PCR amplification fragment. Biotinylated
primers (Table 1) were used to amplify the DQA fragment
from several genomic DNA samples: six homozygous cell
lines and six heterozygous individuals. After hybridization of
the amplified DNA to the membranes and color development,
the DQA genotypes of these samples were readily apparent
(Fig. 2).

Table 1. Sequences of oligonucleotide primers and probes



Although most of the oligonucleotide probes are uniquely
specific for one DQA type, two of the DQA1 subtyping
probes cross-hybridize to several DNA types. GH89 hybridizes
to a sequence common to the DQA1.2, DQA1.3, and
DQA4 types, and the probe GH76 detects all DQA types
except DQA1.3. (The latter is needed to distinguish DQA1.2/
1.3 heterozygotes from DQA1.3/1.3 homozygotes.) The
length and strand specificity of the oligonucleotides were
empirically adjusted until their relative hybridization efficiencies
and stringency requirements for allelic discrimination
were approximately the same. (This was achieved by deter-



Name*  Function  Sequence

R5151 DQA prater bOTGCTOCACKrrGTAAACrTGTACCAGt 

RS152 DQA primer bCACGGATCOOGTAQCAGOOGTAOACmQt 

RH54Q) M DQA types CTACOTOOACCTOOAOAGOAAOGAOACTOCCTG 

GH75(4) DQAi pnAt CTCACXKXXXOOCAOQCA 

RHTl H) DQA2 probe TTCCACACACTTAOATTrGAC 

OK«7 (4) DQAS prabe TrCCQCAOATTTAGAAQAT 

GH66(4) 0eA#probe 10 M m cCrOITCTCAOAC 

CH88(4) DQAtJptdbt OGTAOAACTOCTCATCTCC 

OH»9(4} DQAU.*tJ,'4' OATOAOCAOTTCTACOTOO 

OH77(4} DQAJJpnJbc CTOGAaAA(UAG(3AGAC 

OH76 (4) Not DQAiJ GlCiC UIiCCICICLAO 



Name*  Function  Sequence

RsSi ^Obb primer b-ATCACmAGA(nCA(XXrrQt 

RS152 ^<Hobte primer b^CCTOCCAC Al ICtCi 1 1 1 ^ 

RS187 (8) Konnal fi^^ TAOACCAATAOOCAGAOAO 

RSISS (S) Mmam CrCTCTOCCTATrAOTCn'A 

RS87(4> Normal^ CCTTGGAOCXAGAGGTTCT 

RS89(4) Mutant^ AOAACCTCTAOGTGCAAGO 

RS1S9(PJ3) Normal^*-" CrTOATACCAACCTCOCCA^ 

RS190^.U> Mutant TOGGCAGGTrOGCATCAAO 

RS191 (1) Uutsnt fi*^ TOOGCAOATTOOrrATCAAG 

RS192H) fiom^fi^^ OCATAOACTCACXXTCUAO 

R5t«3 (4) Mututt CnCAG(2AT0AGTCTAT0G 

RS20i O) Nomat GCAOAATOOTAGCTaOATT 

RS2QZ (3) Mtttant ^'^^ GCAtUATOOTACCTCG ATT 

ltS196(4) Konnal^ ACTCCTTOAGOAOAAOrCTO^ 

RS197 (4) Mutant ^ OACTCCTGGOACAAGrCTO 

R51«(4) Mutant^ TGACTCCTQAOOAOOTCrO 



*Where applicable, the values in parentheses indicate the amount (pmol) of tailed oligonucleotide probe applied to the nylon membrane.
†b, Biotin covalently attached to 5' end.

‡These β-globin oligonucleotide probes each span two sites of potential β-thalassemia mutations and are specific for normal sequences at both
positions.



Fig. 2. DNA typing at the HLA-DQA locus. Each tailed
oligonucleotide probe was spotted onto 12 duplicate membranes,
irradiated at 40 mJ/cm², hybridized with amplified DQA sequences in
genomic DNA samples, and treated for color development. The
specificity of each immobilized oligonucleotide is given at the top,
and the DQA genotype of each sample is noted at the right. The name,
amount applied to the membrane, specificity, and sequence of each
oligonucleotide are listed in Table 1.

mining the optimal hybridization conditions for each member
of an initial set of probes, then shortening or lengthening each
oligonucleotide until they all hybridized under equivalent
conditions.) These eight probes produce a unique hybridization
pattern for each of the 21 possible DQA diploid
combinations.
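
The claim that eight probes resolve all 21 diploid combinations can be checked by enumeration. The sketch below is an illustration only: the probe labels are descriptive names of ours, and the specificities are our reading of the text and of the (partly illegible) Table 1, namely four type-specific probes, a DQA1.1 probe, a DQA1.3 probe, the probe common to DQA1.2, 1.3, and 4, and the probe that detects everything except DQA1.3.

```python
from itertools import combinations_with_replacement

ALLELES = ["1.1", "1.2", "1.3", "2", "3", "4"]

# Assumed probe specificities (our reading of the text and Table 1).
PROBES = {
    "DQA1": {"1.1", "1.2", "1.3"},
    "DQA2": {"2"},
    "DQA3": {"3"},
    "DQA4": {"4"},
    "DQA1.1": {"1.1"},
    "DQA1.2/1.3/4": {"1.2", "1.3", "4"},
    "DQA1.3": {"1.3"},
    "not DQA1.3": {"1.1", "1.2", "2", "3", "4"},
}


def pattern(genotype):
    """Tuple of positive/negative spots, one entry per probe."""
    return tuple(any(allele in spec for allele in genotype)
                 for spec in PROBES.values())


# Six alleles give 6 * 7 / 2 = 21 unordered diploid combinations.
genotypes = list(combinations_with_replacement(ALLELES, 2))
patterns = {pattern(g): g for g in genotypes}
print(len(genotypes), len(patterns))
```

If the two printed counts agree, no two genotypes share a spot pattern. Under these assumed specificities, the "not DQA1.3" probe is what separates 1.2/1.3 from 1.3/1.3, as the text notes.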

Detection of β-Thalassemia Mutations. Although there are
>54 characterized mutations of the β-globin gene that can
give rise to β-thalassemia (25), each ethnic group in which
this disease is prevalent has a limited number of common
mutations (4, 26, 27). In Mediterranean populations, 8
mutations are responsible for >90% of the β-thalassemia alleles
(4). Oligonucleotides were synthesized that are specific for
each of these 8 mutations as well as their corresponding
normal sequences (Table 1). The oligonucleotides were given
(dT)400 tails with terminal transferase and applied to
membranes. Since the β-thalassemia mutations are distributed
throughout the β-globin gene, biotinylated PCR primers that
amplify the entire gene in a single 1780-base-pair fragment
were used. (This amplification product encompasses all
known β-thalassemia mutations, not only the predominant
Mediterranean mutations examined here.) After hybridization
and color development, the β-globin genotypes could be
determined by noting the pattern of hybridization (Fig. 3).

Unlike the DQA typing system, two oligonucleotide probes
are needed to analyze each mutation, one specific for the
normal sequence and one specific for the mutant sequence,
in order to differentiate normal/mutant heterozygous carriers
from mutant/mutant homozygotes. A complicating factor in
this analysis is caused by apparent secondary structure in
various portions of the relatively long β-globin amplification
product that interferes with oligonucleotide hybridization.
The relatively high stringency needed to minimize this
secondary structure requires the use of longer (e.g., 19-base)
oligonucleotide probes. Because this constraint would not
permit varying the length of the oligonucleotides to
compensate for different hybridization efficiencies, the "balancing"
of signal intensities was accomplished by adjusting the
amount of each oligonucleotide applied to the membrane.
This was done by applying various amounts of each
oligonucleotide onto a membrane and then, after hybridization and
color development, simply selecting the positive spots that
had similar intensity.
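
The balancing step amounts to picking, for each probe, the applied amount whose spot comes closest to a common target intensity. The sketch below illustrates that selection; the probe names, the intensity values, and the target are all hypothetical and are not data from this study.

```python
def pick_balanced_amounts(titration, target):
    """For each probe, return the applied amount whose signal is nearest the target."""
    return {
        probe: min(signals, key=lambda amount: abs(signals[amount] - target))
        for probe, signals in titration.items()
    }


# Hypothetical spot intensities (arbitrary units), keyed by pmol applied.
titration = {
    "probe_A": {1: 0.4, 2: 0.7, 4: 1.0, 8: 1.6},
    "probe_B": {1: 0.9, 2: 1.5, 4: 2.2, 8: 3.0},
}
chosen = pick_balanced_amounts(titration, target=1.0)
print(chosen)
```

In this made-up example the weaker probe needs 4 pmol and the stronger probe only 1 pmol to give comparable spots, which is the kind of per-probe amount recorded in parentheses in Table 1.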

DISCUSSION 

These studies have demonstrated the feasibility of
immobilizing sequence-specific probes onto nylon membranes and
hybridizing PCR-amplified, biotin-labeled genomic fragments
to the filters to determine the genetic content of the
DNA sample. We have applied this method to HLA-DQA
genotyping and to the detection of β-thalassemia mutations.
Although the number of probes used in the two tests were
modest (9 for DQA and 14 for β-thalassemia), expanding the
analyses to include even more oligonucleotides should not be
difficult.

The recently described technique of simultaneous
amplification of several DNA fragments, "multiplex" PCR (28),
should readily permit the concurrent analysis of multiple
genetic loci. Using the immobilized-probe format, we have
been able to simultaneously amplify and type at three loci: the
HMin polymorphism in the <Vglotnn gene (29), the Ava II
polymorphism in the low density lipoprotein receptor gene
(30), and the HLA-DQA gene (R.K.S., unpublished
observations). Other genetic targets whose analysis would be
simplified by this technique include the detection of somatic
mutations in the RAS genes, where 6 loci and 66 possible
alleles occur (31), some of the HLA class II β-chain genes,
where as many as 25 alleles can be detected (T. Bugawan, S.
Scharf, and H.A.E., unpublished observations), and
β-thalassemia in Middle Eastern populations, where in addition
to the endogenous mutations, Mediterranean and Asian
Indian mutations are present at significant frequencies (H.
Kazazian, personal communication). This format should also
prove useful for the detection of infectious pathogens or for
environmental surveys of microorganisms by immobilizing a
panel of species-specific probes.

Fig. 3. Detection of β-thalassemia
mutations. Various amounts of each
tailed oligonucleotide probe were applied
to 12 duplicate nylon filters, irradiated at
40 mJ/cm², hybridized with amplified
β-globin sequences in genomic DNA
samples, and treated for color
development. The β-thalassemia locus that is
detected by each immobilized
oligonucleotide pair is given at the top of the filters.
For each filter, the upper row contains
the oligonucleotide probes that are
specific for the normal sequence and the
lower row contains the oligonucleotides
specific for the mutant sequences. The
β-globin genotype of each sample is
noted at the right. The name, amount
applied to the membrane, specificity, and
sequence of each oligonucleotide are
listed in Table 1. IVS, intervening
sequence (intron).

The ability to label probes and detect their hybridization
without radioactivity is a convenient feature of PCR-based
DNA tests and, perhaps more importantly, makes this type
of analysis feasible in areas where radioactive labeling
reagents are difficult to obtain. In this report, a biotin tag was
introduced into the PCR products by means of 5'-biotinylated
primers. An alternative labeling strategy based on the
incorporation of biotinylated dUTP (32) has also been tried and
shown to be very effective (R.K.S., unpublished
observations).

One of the prerequisites of this analytical method is that all
of the bound oligonucleotides must be sequence-specific
under the same hybridization conditions. If necessary, this
requirement can probably be met either by adjusting the
length, position, and strand specificity of the probes, as was
done for the HLA-DQA assay, or by varying the amount
applied to the membrane, as was done for the β-thalassemia
assay. The presence of tetramethylammonium chloride in the
hybridization buffer can also serve to minimize the
differences among immobilized oligonucleotides caused by
varying base compositions (ref. 33; T. Bugawan, personal
communication).

Although it may entail some initial effort, the end result is
a simple, robust, and potentially automatable system that can
be completed (amplification, hybridization, and color
development) in 3-4 hr. "Reverse dot blots" should be
particularly valuable for assays where the number of potential
sequence variations exceeds the number of samples to be
tested. Even in situations where the number of samples and
probes are approximately equal, the immobilized-probe
format may be preferable since many filters can be prepared at
one time and stored until needed. To date, this typing system
has been used to determine the HLA-DQA genotype of >300
unknown samples in forensic and disease-susceptibility
studies.

We thank R. Higuchi and S. Scharf for helpful suggestions, L.
Goda, D. Spasic, and C.-A. Chang for synthesis of oligonucleotides,
C. Perez for advice on terminal transferase tailing reactions, C.
Dowling and H. Kazazian (Johns Hopkins) for sequences of β-globin
PCR primers and β-thalassemia genomic DNA samples, S. Warren
and J. Findlay (Eastman Kodak) for red leuco dye suspension, and
T. White, D. Gelfand, and H. Kazazian for critical review of the
manuscript.

ABSTRACT A rapid nonradioactive approach to the
diagnosis of sickle cell anemia is described based on an
allele-specific polymerase chain reaction (ASPCR). This method
allows direct detection of the normal or the sickle cell β-globin
allele in genomic DNA without additional steps of probe
hybridization, ligation, or restriction enzyme cleavage. Two
allele-specific oligonucleotide primers, one specific for the
sickle cell allele and one specific for the normal allele, together
with another primer complementary to both alleles were used
in the polymerase chain reaction with genomic DNA templates.
The allele-specific primers differed from each other in their
terminal 3' nucleotide. Under the proper annealing temperature
and polymerase chain reaction conditions, these primers
only directed amplification on their complementary allele. In a
single blind study of DNA samples from 12 individuals, this
method correctly and unambiguously allowed for the
determination of the genotypes with no false negatives or positives. If
ASPCR is able to discriminate all allelic variation (both
transition and transversion mutations), this method has the
potential to be a powerful approach for genetic disease
diagnosis, carrier screening, HLA typing, human gene mapping,
forensics, and paternity testing.



Sickle cell anemia is the prototype of a genetic disease caused
by a single base-pair mutation, an A→T transversion in the
sequence encoding codon 6 of the human β-globin gene. In
homozygous sickle cell anemia, the substitution of a single
amino acid (Glu→Val) in the β-globin subunit of hemoglobin
results in a reduced solubility of the deoxyhemoglobin
molecule and erythrocytes assume irregular shapes. The
sickled erythrocytes become trapped in the microcirculation
and cause damage to multiple organs.

Kan and Dozy (1) were the first to describe the diagnosis
of sickle cell anemia in the DNA of affected individuals based
on the linkage of the sickle cell allele to an Hpa I restriction
fragment length polymorphism. Later, it was shown that the
mutation itself affected the cleavage site of both Dde I and
Mst II and could be detected directly by restriction enzyme
cleavage (2, 3). Conner et al. (4) described a more general
approach to the direct detection of single nucleotide variation
by the use of allele-specific oligonucleotide hybridization. In
this method, a short synthetic oligonucleotide probe specific
for one allele only hybridizes to that allele and not to others
under appropriate conditions.

All of the above approaches are technically challenging,
require a reasonably large amount of DNA, and are not very
rapid. The polymerase chain reaction (PCR) developed by
Saiki et al. (5) provided a method to rapidly amplify small
amounts of a particular target DNA. The amplified DNA
could then be readily analyzed for the presence of DNA
sequence variation (e.g., the sickle cell mutation) by
allele-specific oligonucleotide hybridization (6), restriction enzyme
cleavage (5, 7), ligation of oligonucleotide pairs (8, 9), or
ligation amplification (10). PCR increased the speed of
analysis and reduced the amount of DNA required for it but
did not change the method of analysis of DNA sequence
variation. In this paper, we investigated whether PCR could
be done in an allele-specific manner such that the presence or
absence of an amplified fragment provides direct
determination of genotype.

PCR utilizes two oligonucleotide primers that hybridize to
opposing strands of DNA at positions spanning a sequence of
interest. A DNA polymerase [either the Klenow fragment of
Escherichia coli DNA polymerase I (5) or Thermus aquaticus
DNA polymerase (11)] is used for sequential rounds of
template-dependent synthesis of the DNA sequence. Prior to
the initiation of each new round, the DNA is denatured and
fresh enzyme is added in the case of the E. coli enzyme. In
this manner, exponential amplification of the target
sequences is achieved. We reasoned that if the 3' nucleotide of
one of the primers formed a mismatched base pair with the
template due to the existence of single nucleotide variation,
amplification would take place with reduced efficiency.
Specific primers would then direct amplification only from
their homologous allele. After multiple rounds of
amplification, the formation of an amplified fragment would indicate
the presence of the allele in the initial DNA.

MATERIALS AND METHODS 

Oligonucleotide Synthesis. Oligonucleotides were synthesized
on an Applied Biosystems 380B DNA synthesizer by
the phosphoramidite method. They were purified by
electrophoresis on a urea/polyacrylamide gel followed by
high-performance liquid chromatography as described (12).

Source and Isolation of Human DNA. All genomic DNA
samples with the exception of the β-thalassemia DNA were
isolated from the peripheral blood of appropriate individuals.
The β-globin genotype of these individuals was previously
determined by hybridization with allele-specific
oligonucleotide probes (4) as well as by hemoglobin electrophoresis.
Thalassemia major DNA was obtained from an Epstein-Barr
virus-transformed lymphocyte cell line obtained from the
National Institute of General Medical Sciences Human
Genetic Mutant Cell Repository (Camden, NJ). Thalassemia
DNA was isolated from the cultured cells. All DNA
preparations were performed according to a modified Triton X-100
procedure followed by proteinase K and RNase A treatment
(13). The average yield of genomic DNA was ≈25 μg per ml
of blood.

PCR. Hβ14A (5'-CACCTGACTCCTGA) and BGP2
(5'-AATAGACCAATAGGCAGAG) at a concentration of 0.12
μM were used as the primer set for the amplification of the
normal β-globin gene (A primer set). Similarly, 0.12 μM
Hβ14S (5'-CACCTGACTCCTGT) and 0.12 μM BGP2 were
used as the primer set for the amplification of the sickle cell
gene (S primer set). Both primer sets directed the
amplification of a 203-base-pair (bp) β-globin allele-specific fragment.
As an internal positive control, all reaction mixtures
contained an additional primer set for the human growth
hormone gene comprised of 0.2 μM GHPCR1
(5'-TTCCCAACCATTCCCTTA) and 0.2 μM GHPCR2
(5'-GGATTTCTGTTGTGTTTC) (hGH primer set). GHPCR1 and GHPCR2
direct the amplification of a 422-bp fragment of the human
growth hormone gene. All reactions were performed in a vol
of 50 μl containing 50 mM KCl, 10 mM Tris·HCl (pH 8.3), 1.5
mM MgCl2, 0.01% (wt/vol) gelatin, template DNA (0.5
μg/ml), and 0.1 mM each dATP, dCTP, dGTP, and TTP.
Reactions were carried out for 25 cycles at an annealing
temperature of 55°C for 2 min, a polymerization temperature
of 72°C for 3 min, and a heat-denaturation temperature of
94°C for 1 min on a Perkin-Elmer Cetus DNA thermal cycler.
At the end of the 25 rounds, the samples were held at 4°C in
the thermal cycler until removed for analysis.

Abbreviations: PCR, polymerase chain reaction; ASPCR,
allele-specific PCR.
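
For bench use, the concentrations above convert to amounts per reaction by simple multiplication. The sketch below shows that conversion, assuming the reaction volume reads 50 μl and the primer and nucleotide concentrations read 0.12 μM, 0.2 μM, and 0.1 mM; the function and variable names are ours.

```python
REACTION_VOLUME_UL = 50.0  # assumed reading of the reaction volume


def pmol_in_reaction(conc_um: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Amount in pmol, since 1 uM is 1 pmol per ul."""
    return conc_um * volume_ul


globin_primer_pmol = pmol_in_reaction(0.12)  # each allele-specific primer and BGP2
control_primer_pmol = pmol_in_reaction(0.2)  # each growth hormone control primer
dntp_nmol = pmol_in_reaction(100.0) / 1000.0  # 0.1 mM = 100 uM of each dNTP

print(f"{globin_primer_pmol:.0f} pmol, {control_primer_pmol:.0f} pmol, {dntp_nmol:.0f} nmol")
```

Under these readings each β-globin primer is present at about 6 pmol and each control primer at about 10 pmol per reaction.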

Analysis of the PCR Products. An aliquot (15 μl) from each
of the completed PCR reactions was mixed with 5 μl of 5x
Ficoll loading buffer (1x = 10 mM Tris·HCl, pH 7.5/1 mM
EDTA/0.05% bromophenol blue/0.05% xylene cyanol/3%
Ficoll) and subjected to electrophoresis in a 1.5% agarose gel.
Electrophoresis was performed in 89 mM Tris·HCl/89 mM
borate/2 mM EDTA buffer for 2 hr at 120 V. At the
completion of electrophoresis, the gel was stained in ethidium
bromide (1.0 μg/ml) for 15 min, destained in water for 10 min,
and photographed by ultraviolet trans-illumination.

RESULTS 

Experimental Design. The scheme describing allele-specific
PCR (ASPCR) is shown in Fig. 1. Primer P1 is designed such
that it is complementary to allele 1 but the 3'-terminal
nucleotide forms a single base-pair mismatch with the DNA
sequence of allele 2 (Fig. 1B, *). Under appropriate annealing
temperature and PCR conditions, there is normal
amplification of the P1-P3 fragment with DNA templates containing
allele 1 (homo- or heterozygous), while there is little or no
amplification from DNA templates containing allele 2. In a
similar way, a primer (P2) could be designed that would allow
the specific PCR amplification of allele 2 but not allele 1
DNA.

FIG. 1. Schematic representation of the ASPCR. P1 and P3,
synthetic oligonucleotide primers that anneal to opposing strands of
a single copy gene. P1 anneals to the region of a gene in the region
of a DNA sequence variation such that its terminal 3' nucleotide base
pairs with the polymorphic nucleotide of the template. P1 is
completely complementary to allele 1 (A) but forms a single base-pair
mismatch with allele 2 at the 3'-terminal position due to one or more
nucleotide differences relative to allele 1 (B).

We designed two 14-nucleotide-long allele-specific primers,
Hβ14S and Hβ14A, complementary to the 5' end of the
sickle cell and normal β-globin genes, respectively. The
oligonucleotide primers differ from each other by a single
nucleotide at the 3' end, Hβ14S having a 3' T and Hβ14A
having a 3' A corresponding to the base pair affected by the
sickle cell mutation. The oligonucleotide primer BGP2 (7)
complementary to the opposite strand 3' of the allele-specific
primers was used as the second primer for PCR. The
amplification product with these primer pairs was 203 bp.
Also included in each reaction was a second pair of primers
that directed the amplification of a 422-bp fragment of the
human growth hormone gene. These primers were included
as an internal positive control.
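
The primer design can be illustrated by checking each 14-mer against the two alleles. In the sketch below the primer sequences are those given in Materials and Methods; the two sense-strand fragments around codon 6 are supplied by us from the standard β-globin coding sequence (GAG in the normal allele, GTG in the sickle allele) and are an assumption, not data from this paper. Because the primers carry the sense-strand sequence, a primer pairs perfectly with the antisense template exactly when it matches the sense strand.

```python
# Assumed sense-strand fragments spanning codon 6 (GAG normal, GTG sickle).
NORMAL = "GTGCACCTGACTCCTGAGGAGAAGTCTGC"
SICKLE = "GTGCACCTGACTCCTGTGGAGAAGTCTGC"

# Allele-specific primers from Materials and Methods (Hb stands for H-beta).
PRIMERS = {"Hb14A": "CACCTGACTCCTGA", "Hb14S": "CACCTGACTCCTGT"}


def three_prime_matched(primer: str, sense_strand: str) -> bool:
    """True if the primer matches the allele up to and including its 3' base."""
    body, last = primer[:-1], primer[-1]
    start = sense_strand.find(body)
    if start < 0:
        return False  # the primer body does not anneal to this fragment at all
    end = start + len(body)
    return end < len(sense_strand) and sense_strand[end] == last


for name, seq in PRIMERS.items():
    print(name, three_prime_matched(seq, NORMAL), three_prime_matched(seq, SICKLE))
```

Each primer is fully matched on one allele only; on the other allele the first 13 bases anneal and only the 3'-terminal base is unpaired, which is the situation ASPCR exploits.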

Discrimination Between the Normal and Sickle Cell Alleles.
Genomic DNA was isolated from peripheral blood leukocytes
of individuals of known β-globin genotypes (βA/βA,
βA/βS, βS/βS). In addition, DNA was isolated from an
Epstein-Barr virus-transformed cell line containing a
homozygous deletion of the β-globin gene (βth/βth). DNA was
subjected to 25 rounds of PCR using either the sickle
cell-specific primer set (Hβ14S and BGP2) or the normal
gene-specific primer set (Hβ14A and BGP2) using an annealing
temperature of 55°C. The results are shown in Fig. 2A. It
can be seen that a 203-bp fragment is observed using the
sickle cell-specific primer set only with the βS/βS and βA/βS
genomic DNA templates and not with the βA/βA genomic
DNA templates. Conversely, the normal gene-specific primer
set only gave rise to an amplification product with βA/βA
and βA/βS genomic DNA templates. As expected, the
thalassemia DNA did not give rise to a β-globin gene
amplification product with either primer set. The internal
growth hormone gene control gave rise to a 422-bp fragment
in all samples, demonstrating that in no case was the absence
of a globin-specific band due to a failure of the PCR.

In a single blind study, the DNA from 12 individuals with
different β-globin genotypes was analyzed with the two
primer sets. The results are shown in Fig. 2B. Individuals 1,
2, 3, and 5 are predicted to be ^^/^; individuals 6, 9, 10, and
11 are predicted to be and individuals 4, 7, 8, and 12
are predicted to be fi^/p^>. In each case, the genotype was
correctly and unambiguously predicted from the pattern of
fragment amplification (see legend to Fig. 2 for clinically
diagnosed genotype).
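
The decision rule used to read the gels can be written out explicitly. The sketch below is one way to encode it, using the band pattern described in the text and in the legend to Fig. 2; the function name, argument names, and result strings are ours.

```python
def call_genotype(a_band: bool, s_band: bool,
                  control_a: bool, control_s: bool) -> str:
    """Interpret one sample from its two ASPCR lanes.

    a_band / s_band: 203-bp fragment seen with the normal / sickle primer set.
    control_a / control_s: 422-bp growth hormone fragment seen in each lane.
    """
    if not (control_a and control_s):
        return "invalid: internal control failed"
    if a_band and s_band:
        return "heterozygous (sickle cell trait)"
    if a_band:
        return "homozygous normal"
    if s_band:
        return "homozygous sickle cell"
    return "no beta-globin product"


print(call_genotype(True, False, True, True))
print(call_genotype(False, False, True, True))
```

The last outcome corresponds to the β-globin deletion cell line, where the control band shows that the reaction worked even though neither primer set gives a 203-bp fragment.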

DISCUSSION 

The results presented above indicate the potential usefulness
of ASPCR for sickle cell diagnosis. The method is rapid and
the result is obtained without the use of radioactivity, since
all that is required is to visualize the band on a gel with
ethidium bromide staining. It should be possible to further
improve the technique by elimination of the gel separation
step. One strategy for this is shown in Fig. 3. As proposed
recently by Yamane et al. (15), the two primers for the PCR
could be labeled differently, one with biotin and one with a
fluorescent group such as fluorescein or tetramethyl
rhodamine. The product of the PCR could be captured on
streptavidin-agarose and the presence of the amplified sequence
could be detected with the fluorescence. In this case, if one
allele-specific primer were labeled with one fluorescent group
and the other were labeled with a different one, then the
ASPCR could be done simultaneously.

In this study, we have used PCR primers that form either
an A·A or a T·T mismatch. It is not clear that other
mismatches will give equally effective discrimination. Since
G·T mismatches are more stable than other mismatches (16),
G·T should probably be avoided when designing primers.
This can be done by designing the primer so that it is
complementary to the strand with which it forms an A·C
mismatch. It may be possible to use a competition approach,
as we have previously used to improve the discrimination
provided by oligonucleotide hybridization probes (17). In this
case, a competitive primer could be designed that was not
able to prime, for example, by including in it a 3'
dideoxynucleotide or a 3' ribonucleotide that has been oxidized. A
mixture of a labeled allele-specific primer complementary to
allele 1 plus an unlabeled priming-defective primer
complementary to allele 2 should then allow the specific
amplification of allele 1.

FIG. 2. (A) Identification of the normal (βA) and the sickle cell
(βS) alleles by ASPCR. Normal (βA/βA), homozygous sickle cell
(βS/βS), heterozygous sickle cell (βA/βS), and homozygous
β-thalassemia (βth/βth) DNA samples (0.5 μg each) served as template
using either the normal (A primer set) or the sickle cell (S primer set)
for the ASPCRs. As an internal positive control, all reactions
contained an additional primer set for the human growth hormone
gene (hGH primer set) that directed the amplification of a 422-bp
fragment of the human growth hormone gene. After amplification, 15
μl from each reaction mixture was subjected to electrophoresis in a
1.5% agarose gel for 2 hr at 120 V. Ethidium bromide staining of the
agarose gel was used to detect PCR amplified fragments. Positive
β-globin ASPCR can be identified by the presence of a 203-bp
fragment using either the A or the S primer set reaction. As a marker
for the globin-specific fragment, 0.3 ft% of plasmid pH^^ containing
the normal human globin gene (3^) was amplified with the A primer
set alone (M1). As a marker for the growth hormone-specific
fragment, 0.1 mE of plasmid pXGH5 containing a 3.8-kilobase
fragment of the human growth hormone gene (14) was amplified with
the growth hormone primer set (hGH) alone (M2). (B) A single blind
trial using ASPCR to diagnose the β-globin genotype of genomic
DNA samples. Genomic DNA samples from 12 individuals (4 each
of normal, homozygous, and heterozygous sickle cell individuals)
were randomly assigned numbers 1-12 by the hematology laboratory
and blinded to the investigators. ASPCR was performed using both
the normal (A) and the sickle cell-specific (S) primer sets as described
above. Genotypes were identified as homozygous normal (βA/βA)
if the 203-bp fragment appears exclusively in the A primer set
reaction, as homozygous sickle cell (βS/βS) if the 203-bp fragment
appears only in the S primer set, or as heterozygous sickle cell trait
(βA/βS) if the fragment appears in both reactions. The genotypes of
these DNA samples were previously determined by hemoglobin
electrophoresis (results not shown). The genotypes of the 12
individuals are as follows: 1, 2, 3, and 5. 6, 9, 10, and 11,
4, 7, 8, and 12. ^«/^,

FIG. 3. Schematic representation of a dual labeling system
suitable for the detection of the ASPCR products. One of the
oligonucleotide primers is labeled at the 5' end with a fluorescent
group such as fluorescein or tetramethyl rhodamine (L) and the other
primer is labeled with biotin (B). The ASPCR amplification product
would thus have the 5' end labeled on both strands. The biotin
is suitable for capturing the amplified fragment on a
streptavidin-agarose column, while the fluorescent group is suitable for measuring
the amount of fragment produced.

The ability of an oligonucleotide to prime on a DNA
template is governed by two kinetic variables: the rate at
which the annealed primer dissociates from the template
before initiating polymerization (r_off) and the rate at which the
DNA polymerase extends the primer (r_pol). Efficient priming
in PCR should take place whenever r_pol > r_off, the addition of
the first few nucleotides to the primer then greatly stabilizing
the oligonucleotide-template complex and allowing
continued extension of the primer. For a given primer, r_pol is an
intrinsic property of the polymerase. Studies with E. coli
DNA polymerase I have suggested that this polymerase may
be able to discriminate between primers that either do or do
not form a mismatch with the template at the 3'-terminal
nucleotide (18). In this case, r_pol for the mismatched primer
was slower than r_pol for the perfectly matched primer. For the
present study, we designed the allele-specific primers such
that the allele-specific nucleotide in the template was
complementary to the 3'-terminal nucleotide of the primer. In this
way, the 3' nucleotide of the primer specific for one allele
would form a mismatch with the other allele. This design
allows one to take advantage of the difference between r_pol of
the perfectly matched and mismatched primers as well as to
optimize primer concentration, priming temperature, primer
length, and primer sequence, all of which will affect the
difference in the r_off for the two allele-specific primers.

We reasoned that a set of conditions should exist such that
r_pol > r_off for the perfectly matched primer, while r_pol < r_off
for the mismatched primer. The results shown here clearly
demonstrate this to be true. In our study, the allele-specific
primers were 14 nucleotides long. We found (data not shown)
that discrimination between the βA and βS alleles was not
possible at low annealing temperatures (e.g., 44°C and 50°C).
Presumably the short length of the oligonucleotides as well as
the high annealing temperature combined to provide the
discrimination.

Taq polymerase is well suited for using ASPCR for the
discrimination of two alleles that differ by a single nucleotide
because it lacks a 3'→5' exonuclease activity (19). Such an
activity would correct the mismatched base pair in the
mismatched primer-template complex and then permit efficient
priming with the one-nucleotide-shorter primer. Since
the specificity of the ASPCR is determined in the initial
several cycles of PCR, the fact that the primer remains
uncorrected enhances the discrimination of the reaction.
PCR is an exponential reaction; the yield of product is very
dependent on the efficiency of each round (5). Only very
minor changes in the efficiency of each round of amplification
have profound effects on the overall yield after many rounds.
For example, if the efficiency of the reaction with the
perfectly matched primer is 90% and with the mismatched
primer is 60%, there would be 73-fold more product produced
in the reaction with perfectly matched primer than with the
mismatched primer.
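
The 73-fold figure can be checked with a short calculation. The sketch below assumes that each round multiplies the template by (1 + efficiency); the text does not state the number of cycles, so the cycle counts tried here are assumptions. Under this model, 25 cycles gives a ratio of roughly 73.

```python
def relative_yield(eff_matched, eff_mismatched, cycles):
    """Ratio of product yields after `cycles` rounds of PCR.

    Assumes each round multiplies the amount of template by
    (1 + efficiency), with a constant efficiency in every round.
    """
    return ((1.0 + eff_matched) / (1.0 + eff_mismatched)) ** cycles


if __name__ == "__main__":
    # Cycle counts are illustrative; the source does not give one.
    for n in (10, 20, 25, 30):
        print(n, round(relative_yield(0.90, 0.60, n), 1))
```

Because the ratio grows exponentially with cycle number, a modest per-round difference in efficiency dominates the final yield, which is the point the paragraph makes.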

The ASPCR should find application in the fields of genetic
diagnosis, carrier screening, HLA typing, and any other
nucleic acid-based diagnostic in which the precise DNA
sequence of the priming site is diagnostic for the target. In the
case of HLA typing, recent advances have used PCR
amplification followed by allele-specific oligonucleotide
hybridization for the determination of DR, DQ, and DP alleles
(6, 20-22). It should be possible to use ASPCR for the direct
analysis of HLA types.

We have recently proposed a process for the simultaneous
determination of multiple polymorphic loci based on the
concept of producing locus-specific amplification products
each with a unique length (23). In such a system, since
ASPCR would produce allele-specific products, the
simultaneous analysis of the genotype of the target DNA at multiple
loci should be possible.

This work was supported by Grant DCB-R3 15365 from the
National Science Foundation (R.B.W.). D.Y.W. is an M.D./Ph.D.
candidate at Loma Linda University. R.B.W. is a member of the
Cancer Center of the City of Hope (NCI CA33572). L.U. is a fellow
of AIRC (Associazione Italiana per la Ricerca sul Cancro).

ABSTRACT DNA diagnostics, the detection of specific
DNA sequences, will play an increasingly important role in
medicine as the molecular basis of human disease is defined.
Here, we demonstrate an automated, nonisotopic strategy for
DNA diagnostics using amplification of target DNA segments
by the polymerase chain reaction (PCR) and the discrimination
of allelic sequence variants by a colorimetric oligonucleotide
ligation assay (OLA). We have applied the automated
PCR/OLA procedure to diagnosis of common genetic diseases,
such as sickle cell anemia and cystic fibrosis (ΔF508 mutation),
and to genetic linkage mapping of gene segments in the human
T-cell receptor β-chain locus. The automated PCR/OLA
strategy provides a rapid system for diagnosis of genetic,
malignant, and infectious diseases as well as a powerful
approach to genetic linkage mapping of chromosomes and
forensic DNA typing.

The study of DNA sequence variants in humans is playing an
important role in diagnosis of genetic and malignant diseases
(1, 2). The analysis of DNA polymorphisms also serves as the
fundamental tool in attempts to construct genetic linkage
maps (3, 4) and in forensic analyses (5, 6). Since the majority
of DNA sequence variants and polymorphisms are single
nucleotide substitutions (1, 2), diagnostic techniques must
accurately discriminate single base changes.

Single base variations in DNA sequences can be detected
by a variety of techniques including Southern blot analysis (7)
for restriction fragment length polymorphisms, allele-specific
oligonucleotide hybridization (8), denaturing gradient gel
electrophoresis (9), chemical cleavage of mismatched
heteroduplexes (10), conformational changes in single strands
(11), and allele-specific priming of the polymerase chain
reaction (PCR) (12-14). These techniques have several
disadvantages for automating DNA diagnosis, which include the
use of radioactivity, the requirement for various hybridization
conditions, and the need for electrophoresis or
centrifugation.

The analysis of DNA sequence variants has been greatly
facilitated by the development of rapid methods to
exponentially amplify specific DNA or RNA targets. Diagnostic
targets can be amplified by PCR (15-17) or by other available
methods (18-21). Amplification generates specific targets
with high signal/noise ratios and permits the use of less
sensitive nonisotopic reporters in DNA analysis.

An alternative strategy for DNA diagnosis, the oligonucleotide
ligation assay (OLA), employs two adjacent oligonucleotides
(20-mers), a 5' biotinylated probe (with its 3' end
at the nucleotide to be assayed) and a 3' reporter probe
(22-24). The two oligonucleotides are hybridized to target
DNA and, if there is perfect complementarity, the enzyme
DNA ligase covalently joins the 5' biotinylated probe and the
3' reporter probe. If the probes and target are mismatched at
their junction, a covalent bond is not formed. Capture of the
5' biotinylated probe on immobilized streptavidin and analysis
for covalently linked 3' reporters determine the nature of
the probe-target interaction (matched or mismatched). The
ligase assay uses a standard set of conditions to distinguish all
nucleotide mismatches, and product analysis does not require
electrophoresis or centrifugation (22). In this report, we
describe a strategy for automating DNA diagnosis that combines
target amplification by PCR with a nonisotopic analysis
of DNA sequence variants by OLA.
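
The decision rule of the ligation assay can be sketched in a few lines. This is a deliberately simplified model, not the authors' procedure: it treats ligation as all-or-nothing and requires perfect complementarity along both probes, whereas the assay described above discriminates chiefly at the probe junction. The sequences and function names are hypothetical and chosen only for illustration.

```python
# Toy model of the oligonucleotide ligation assay (OLA) decision rule.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def ola_ligates(target: str, biotin_probe: str, reporter_probe: str) -> bool:
    """True if the probes anneal side by side with perfect complementarity.

    All sequences are written 5'->3'. The 3' end of the biotinylated probe
    abuts the 5' end of the reporter probe, so the joined product must be
    the reverse complement of a stretch of the target strand.
    """
    return reverse_complement(biotin_probe + reporter_probe) in target


# Hypothetical sequences, for illustration only (not taken from Table 1).
PROBE_ALLELE_1 = "ACCTGACTCCTGA"   # 3'-terminal A matches allele 1
PROBE_ALLELE_2 = "ACCTGACTCCTGT"   # 3'-terminal T matches allele 2
REPORTER = "GGAGAAGTCTGCC"
TARGET_ALLELE_1 = reverse_complement("TTT" + PROBE_ALLELE_1 + REPORTER + "AAA")

if __name__ == "__main__":
    print(ola_ligates(TARGET_ALLELE_1, PROBE_ALLELE_1, REPORTER))
    print(ola_ligates(TARGET_ALLELE_1, PROBE_ALLELE_2, REPORTER))
```

Running each allele-specific probe against the same target mirrors the paired ligation reactions used to score alleles 1 and 2.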

MATERIALS AND METHODS 

Robotic Workstation. A Biomek 1000 workstation (Beckman)
equipped with multipipet tools and a multibulk tool was
used to perform all pipetting, aspirating, and washing
procedures. The workstation has been modified with a solenoid
to switch wash solutions during the ELISA. All reagents for
sample processing were stored in sterile 96-minitube
cassettes.

DNA Samples. DNA from humans with α1-antitrypsin,
β-globin, and cystic fibrosis variants was obtained from F.
Heytmancik (Baylor University), from K. Tanaka (Harbor
Hospital) and J. Korenberg (Cedar-Sinai Hospital), and from
A. Osher and E. Hsu (Children's Hospital), respectively, and
prepared as described (22). DNA for amplification of human
T-cell receptor β-chain (TCRβ) gene segments was obtained
by gently scraping cells from the lining of the buccal cavity
with a sterile toothpick. Buccal cells were dislodged into a
minitube containing 10 μl of sterile H2O, covered with 75 μl
of mineral oil, and placed into a 96-minitube cassette for
handling by the robotic workstation. Cells were lysed with 20
μl of 0.1 M KOH and 0.1% Triton X-100 at 65°C for 20 min
and neutralized with 20 μl of 0.1 M HCl and 0.1% Triton
X-100.

Oligonucleotides. Amplification primers and ligation probes
were assembled by using standard phosphoramidite chemistry
on an Applied Biosystems 380A DNA synthesizer. Ligation
probes were modified with a 5' biotin group as described (15)
or chemically phosphorylated with 5' Phosphate-ON (Clontech)
according to the manufacturer's directions. Modified
probes were purified by reverse-phase high-performance
liquid chromatography. Phosphorylated oligonucleotide probes
(500 pmol) were labeled with dUTP-digoxigenin by mixing 100
mM potassium cacodylate, 2 mM CoCl2, 200 μM dithiothreitol,
2.5 μl of dUTP-digoxigenin (Boehringer Mannheim),
and 2 μl of adenosine triphosphate (40 μM) with 70 units of
terminal deoxynucleotidyltransferase (Collaborative
Research) for 1 hr at 37°C. Free dUTP-digoxigenin was removed
by two successive ethanol precipitations.

Abbreviations: PCR, polymerase chain reaction; OLA,
oligonucleotide ligation assay; TCRβ, T-cell receptor β chain; CFTR, cystic
fibrosis transmembrane conductance regulator; V, variable; D,
diversity; J, joining; C, constant; STS, sequence-tagged site.
*To whom reprint requests should be addressed.
†Current address: Department of Medical Genetics, University of
Uppsala, Box 599, Biomedical Center, S-751 23 Uppsala, Sweden.

DNA Amplification. The robotic workstation was programmed
to assemble PCR reagents [5 μl containing 20 mM
Tris-HCl (pH 8.3), 100 mM KCl, 3 mM MgCl2, 20 ng of bovine
serum albumin per ml, the four deoxynucleotide triphosphates
each at 400 μM, 0.5 μM amplification primers, 0.1%
Triton X-100, and 0.05 unit of Thermus aquaticus DNA
polymerase per well], genomic DNA (5 μl at 2 ng/μl in sterile
distilled H2O containing 0.1% Triton X-100), and 70 μl of light
mineral oil in a flexible U-bottomed 96-well microtiter plate
(Falcon). Genomic DNA samples were denatured at 93°C for
4 min and amplified by 40 cycles of 93°C for 30 sec, 55°C
[cystic fibrosis transmembrane conductance regulator
(CFTR) and TCRα constant (Cα) gene segments] or 61°C
(β-globin and α1-antitrypsin gene segments) for 45 sec, and
72°C for 90 sec in a microtiter plate thermal cycler (MJ
Research, Watertown, MA). For amplification of TCRβ gene
segments, 15 μl of PCR reagents (as described above)
containing all six amplification primers, 15 μl of the lysed buccal
samples, and 70 μl of mineral oil were added to a flexible
microtiter plate. Targets were denatured at 93°C for 4 min and
amplified by 20 cycles of 30 sec at 93°C, 45 sec at 61°C, and
90 sec at 72°C. Five microliters from these reaction mixtures
was used to initiate a second round of amplification for each
of the individual TCRβ gene segments (40 cycles; 30 sec at
93°C, 45 sec at 61°C, and 90 sec at 72°C).
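
As a rough check on run length, the programmed hold times of the cycling protocols above can be tallied. The sketch below counts only the stated hold times; ramp times between temperatures are not given in the text and are ignored, so real runs would take longer. The function and variable names are illustrative.

```python
def program_seconds(initial_denature_s, cycles, step_seconds):
    """Total programmed hold time: initial denaturation plus all cycles."""
    return initial_denature_s + cycles * sum(step_seconds)


# Denature, anneal, extend hold times (seconds) as given in the protocol.
STEPS = (30, 45, 90)

# Genomic DNA: 4 min at 93°C, then 40 cycles.
single_round = program_seconds(4 * 60, 40, STEPS)
# TCR-beta buccal samples: 4 min at 93°C, then 20 cycles ...
nested_first = program_seconds(4 * 60, 20, STEPS)
# ... followed by a 40-cycle second round (no initial hold is stated).
nested_second = program_seconds(0, 40, STEPS)

if __name__ == "__main__":
    for name, secs in (("single round", single_round),
                       ("nested, first round", nested_first),
                       ("nested, second round", nested_second)):
        print(f"{name}: {secs / 60:.0f} min")
```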

Ligation Assays. Ligation reaction mixtures were assembled
by the robotic workstation. Forty-five microliters of 0.25
M NaOH containing 0.1% Triton X-100 was added to amplified
DNA samples. Ligation probes (200 fmol each) in 10 μl
of 2x ligase buffer (100 mM Tris-HCl, pH 7.5/20 mM
MgCl2/2 mM spermidine/2 mM adenosine triphosphate/10
mM dithiothreitol) and 50% formamide were added to a
U-bottomed 96-well microtiter plate. DNA samples were
neutralized with 45 μl of 0.25 M HCl and six 10-μl aliquots
were added to the microtiter plate containing the ligation
probes. Samples were covered with 70 μl of mineral oil,
denatured at 93°C for 2 min, cooled, and returned to the
workstation for the addition of 5 μl of T4 DNA ligase (5
units/ml) (Amersham) in 1x ligase buffer. Ligations were
done at room temperature (RT) for 15 min. Reactions were
stopped by adding 10 μl of 0.25 M NaOH per well and, after
2 min at RT, 4 μl of 3 M sodium acetate (pH 6.5) per well.
Samples were transferred to a 96-well flat-bottomed microtiter
plate (Falcon) coated with streptavidin [60 μl of streptavidin
(100 μg/ml) or avidin (100 μg/ml) (Vector Laboratories)
for 1 hr at 37°C] and blocked 20 min (RT) before use with 200
μl of 100 mM Tris-HCl, pH 7.5/150 mM NaCl/0.05% Tween
20 (buffer A) per well with 0.5% dry milk and 100 μg of salmon
sperm DNA per ml. Biotinylated probes were captured at RT
for 30 min, and the plate was washed twice with 0.01 M
NaOH and 0.05% Tween 20 and once with buffer A. Thirty
microliters of anti-digoxigenin antibodies (diluted 1:1000;
Boehringer Mannheim) in buffer A with 0.5% dry milk was
added to each microtiter well. Plates were incubated 30 min
(RT) and washed six times with buffer A. Substrate (30 μl of
BRL ELISA amplification system per well) was added, the
plates were incubated 15 min (RT), and 30 μl of amplifier was
added. Spectrophotometric absorbances were taken at 490
nm by a Bio-Tek (Burlington, VT) plate reader and absorbances
were directly entered into an IBM-XT computer.

Linkage Analysis. Observed haplotype frequencies were
calculated for genetic linkage analysis of TCRβ gene segments
with a myriad haplotype program (25). The probability
of linkage disequilibrium was calculated based on the
distribution of the Q statistic described by Hedrick et al. (26).



RESULTS 

The Automated PCR/OLA Strategy. Our strategy for
automated gene analysis is shown in Fig. 1. A Biomek 1000
robotic workstation was used to (i) prepare targets and
assemble reagents for DNA amplification, (ii) mix and ligate
5' biotinylated probes and 3' digoxigenin-labeled reporter
probes on amplified DNA targets using T4 DNA ligase, (iii)
capture 5' biotinylated probes on streptavidin-coated microtiter
plates, (iv) wash plates, and (v) detect the digoxigenin
reporter coupled to biotin-labeled probes by an ELISA.
Altogether, processing time for 96 samples from entry to
computer read-out takes <7 hr. Overnight amplification
permits processing of ligation assays from 192 DNA samples
in a single day (1200 reactions, triplicates for two alleles).

Amplification Primers and Ligation Probes. A panel of
amplification primers and ligation probes for known sequence
variants in human DNA have been synthesized (Table
1). Two sets of probes detect mutations that cause common
genetic diseases in homozygous individuals, sickle cell anemia
and CF (27, 28). Another set detects a common mutation
in the α1-antitrypsin gene that, in homozygous individuals,
leads to a predisposition for cirrhosis of the liver in childhood
and emphysema in adults (29). The remaining probes detect

[Figure 1 panels: 1, Amplify Target DNA; 2, Denature, Anneal and
Ligate Modified Oligonucleotides on Amplified Target; 3, Capture
Biotinylated Oligonucleotides and Perform ELISA for Digoxigenin]

Fig. 1. Schematic diagram of the steps in the automated PCR/
OLA procedure performed with a robotic workstation. The assay
contains three steps: 1, DNA target amplification; 2, analysis of
target nucleotide sequences with biotin (B)-labeled and digoxigenin
(D)-labeled oligonucleotide probes and T4 DNA ligase (L); 3, capture
of the biotin (B)-labeled probes on streptavidin (SA)-coated
microtiter wells and analysis for covalently linked digoxigenin (D) by using
an ELISA procedure with alkaline phosphatase (AP)-conjugated
anti-digoxigenin (αD) antibodies and a substrate (S).



Table 1. Nucleotide sequence of the amplification primers and ligation probes used in automated DNA analysis

by ligation probes



CAACrrCATCCACGTTCACCTTGCC 
AGGGCAGOAGCCAGGGCrCGG 
Ol-AntUrypsin TCAGCCTTACAACGTGTCTCTGCTT 
GTATGGCCTCTAAAAACATGOCCXX 
CAGTGGAAGAATCCCATTCTCTr 
CGCATCt iri OATCACCCTTCTO 
CCTTGAACCTCOGAGTCC 

gagctaagagacccgtactcg 
aaggoaaaggatotagac 
crggcacacaoatacacgocc 
aaggcaaaggatgtagac 
ctgccacaoagatacacxkkrc 

GAGTCACACAAACCCCAAAGCACCT 
CCTCCTGOCACAOAAATACAAAGCT 2, 
GATTATGOTCCnTCCCGO 1. 
AOCTCCACGTOGTCGCOCT 2, 



cm 

Co 



B-ATGGTGCACCTGACTCCTGA 

B-ATCGTCCACCTGACTCCTOT 

B-GGCTOTGCTOACCATCOACG 

MOCTOTGCTGACCATCGACA 

B-ATTAAAGAAAATATCATCTT 

B-ACCATTAAACAAAATATCAT 

B-GAAACGAAGAAACTGAGGCCA 

B^AAACGAAGAAACTGAGGCCC 

B-TTTACTGGTACCGACAGAGC 

B-ttTACTCGTACCGACAOAGC 

B-TYrTGCAGAGAGGACIGGGGG 

B-TCTGCACAGAGGACTCCKvGA 

B-AGGCCTCCAGTTCCTCATTCAG 

B-ACGCXTO:ACTTCCTCArrCAC 

B^CCAGGACCACACAGCTCTC 

B-ACCAOOACCAGACAOCTCIT 



pGCAGAACTCTGCCGlTACTO-D 

pAGAAACGCACTGAACCTCCTD 

pTtXmrnTCCrATOATGAAT-O 

pCACAGCTAATGAGTGAGCAAGAD 

pCTGGGGCAGGGCCTGGACTT-D 

pATCCGTCTCCACTCTGACOA-D 

pTATTATAATGGAGAAGACAGAGCA-D 

pAGACCAACCCTACCCCCATTAC-D 



l.^A 

1. M 

2. Z 

1. Non'F508 

2. AF508 

1. C«3A 

2. C^B 

1. V(^.71A 

2. V^.71& 
1- V^.72A 
2. V^.73B 
1. 

2. V^B 

1. Cfl^A 

2. C^B 



Ligation reactions were performed with a mixture of a biotin-labeled and reporter-labeled probe for each specific allele.



polymorphisms in the human TCRβ and TCRα loci (refs. 30
and 31; C. Whitehorst, P. Charmley, L.H., and D.A.N.,
unpublished data). Most of these probes detect single nucleotide
substitutions in a specific DNA target. However, one
set of probes detects a 3-base-pair (bp) deletion in the gene
encoding CFTR (28) and represents a model for the detection
of sequence deletions by OLA.

Analysis of DNA Sequence Variants. As a model for DNA
diagnosis by the PCR/OLA procedure, we obtained genomic
DNAs from 32 individuals of known genotype. The robotic
workstation was used to assemble PCR reagents and genomic
DNA samples in a 96-well microtiter plate. After amplification,
ligations were performed in triplicate for each allele, and
the immobilized probes were analyzed for the presence of
digoxigenin. An example of a microtiter plate obtained from
this process is shown in Fig. 2. Amplified targets from
homozygous and heterozygous individuals for the indicated
nucleotide substitutions (β-globin, α1-antitrypsin, and TCR
Cα) or deletion (CFTR) were used. The assay clearly identifies
which alleles 1 and/or 2 (Table 1) were present in each
of the amplified samples (Fig. 2). Fig. 3 shows the mean
absorbances obtained from ligation assays on amplified DNA
targets from eight different individuals for each of the analyzed
gene segments (32 individuals altogether). Mean absorbances
from different individuals ranged from 0.38 to 1.17.
We have found that mean absorbances from the ligation
assays reflect the amount of target present in an amplified
DNA sample. In this regard, the colorimetric assay is quite
sensitive and can detect 3 fmol of ligated product (data not
shown). The high signal/noise ratios (10:1-200:1) obtained
with this procedure also permit simple data processing to
define the genotype of an amplified DNA sample by calculating
the ratio of the mean absorbance for each allele in the
ligation assay. Furthermore, since the outcome of the PCR/
OLA procedure is based on the mean absorbance of triplicate
ligation reactions, the chance of error arising from spurious
false-negative or false-positive wells is also minimized
(false-negative or false-positive wells < 0.2% in 4000 reactions; data
not shown).
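
The ratio-based genotype call described above might be sketched as follows. The cutoff values and the function name are assumptions introduced for illustration; the text gives only the observed signal/noise range (10:1 to 200:1) and does not state the thresholds the authors used.

```python
from statistics import mean


def call_genotype(allele1_wells, allele2_wells,
                  ratio_cutoff=5.0, min_signal=0.2):
    """Call a genotype from triplicate absorbances for alleles 1 and 2.

    ratio_cutoff and min_signal are hypothetical thresholds, not values
    taken from the source.
    """
    a1 = mean(allele1_wells)
    a2 = mean(allele2_wells)
    if max(a1, a2) < min_signal:
        return "no call"      # neither ligation reaction gave a signal
    if a1 >= ratio_cutoff * a2:
        return "1/1"          # only allele 1 detected
    if a2 >= ratio_cutoff * a1:
        return "2/2"          # only allele 2 detected
    return "1/2"              # comparable signal from both alleles


if __name__ == "__main__":
    print(call_genotype([0.95, 1.02, 0.99], [0.03, 0.02, 0.04]))
    print(call_genotype([0.60, 0.70, 0.65], [0.55, 0.60, 0.58]))
```

Averaging the triplicate wells before taking the ratio is what limits the effect of a single spurious well on the call.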

Genetic Linkage Analysis of TCRβ Genes. The automated
PCR/OLA protocol has been extended to include the preparation
of DNA samples by the robotic workstation. Amplified
DNA targets from human buccal samples were used to
determine the frequency and genetic linkage of four DNA
sequence polymorphisms in the human TCRβ locus as shown
in Fig. 4. The human TCRβ locus is composed of several gene
segments, variable (V), diversity (D), joining (J), and
constant (C) genes, which span >600 kilobases (kb) of DNA
(Fig. 4) (32, 33). Using data obtained from the automated
PCR/OLA procedure on these 96 samples, we found that two
Vβ6.7 polymorphisms were in complete linkage disequilibrium
(P < 10^[exponent illegible]). This finding was not surprising since these
variants are separated by a small physical distance (100 bp).
Although the exact location of the Vβ6.7 gene segment in the
TCRβ locus is not known, analysis of available cosmid and
YAC clones by gene-specific PCR suggests that Vβ6.7 is
probably located 5' to the Vβ1 gene segment. The three TCR
polymorphisms (Vβ6.7, Vβ1, and Cβ), physically spanning at
least 600 kb, appeared to be in linkage equilibrium with one
another. Indeed, the expected haplotype frequencies calculated
assuming linkage equilibrium were very close to those
observed (P < 0.81) (Table 2). These findings confirm those
recently reported in a study of TCR polymorphisms detected
as restriction fragment length polymorphisms and may suggest
that hot spots of recombination exist in the TCRβ locus
(34).

Fig. 2. Amplified DNA targets obtained from genomic DNA samples were analyzed in triplicate by using the indicated combinations of
ligation probes (alleles 1 and 2 as described in Table 1) for each specified gene segment. Wells containing digoxigenin form a magenta-colored
product and indicate complementarity between the ligation probes and amplified DNA target.

Fig. 3. Mean spectrophotometric absorbances (+1 SD) from
triplicate ligation reactions performed by the automated PCR/OLA
procedure on amplified DNA samples obtained from eight donors for
each gene analyzed (32 DNA samples total).

Table 2. TCR haplotypes

DISCUSSION 

Automated analysis of DNA polymorphisms and variants by
PCR/OLA has many advantages over existing approaches to
DNA diagnostics. Small numbers of cells (cheek scraping) or
DNA samples (10 ng) are sufficient for analysis. Only small
fragments of DNA (a few hundred base pairs) are required.
Therefore, partially degraded DNA is still useful. The reagents
are stable and easily obtained, and nonisotopic reporter
groups are used. The entire assay is performed in
microtiter wells, thus avoiding the use of centrifugation or
electrophoresis. The assay yields high signal/noise ratios and
a simple readout that is easily transferred to a computer for
storage and analysis; no measurements of DNA fragment
sizes are necessary. All of the tested sequence variants
(nucleotide transitions and transversions, and a deletion)
could be discriminated by OLA using a standard set of
conditions. The initial PCR amplification facilitates the
discrimination of polymorphisms in individual members of a
multigene family (e.g., the TCR Vβ6.7 gene segment is one of
nine highly similar members of the Vβ6 subfamily). The two
successive levels of sequence discrimination, PCR and then
OLA, enhance signal/noise ratios and reduce the likelihood
of error, particularly in the analysis of polymorphisms in
multigene families. The steps in the assay are automatable,
eliminating the need for human intervention (and possible
mistakes) in a tedious and repetitious process. With automation,
high throughput is possible. At present, we can process
1200 ligation reactions per day with a single operator and
robotic workstation, and, in the near future, further automation
with a robotic arm will permit processing of 6000
reactions per day.

Table 2 (note). Expected haplotypes were calculated assuming random
allelic association, e.g., AAA = 0.66 x 0.83 x 0.66 x 192 = 69.
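
The expected haplotype counts under random allelic association (linkage equilibrium) follow from multiplying allele frequencies, as in the Table 2 example AAA = 0.66 x 0.83 x 0.66 x 192 = 69. The sketch below reproduces that arithmetic for all eight haplotypes of three biallelic sites. It assumes that 192 is the number of chromosomes sampled (two per individual for 96 individuals) and that the B allele frequency at each site is one minus the A allele frequency; the function name is illustrative.

```python
from itertools import product
from math import prod


def expected_haplotype_counts(freq_a, n_chromosomes):
    """Expected counts of every A/B haplotype under linkage equilibrium.

    freq_a holds the frequency of the A allele at each biallelic site;
    the B allele frequency is taken as 1 - freq_a at that site.
    """
    counts = {}
    for haplotype in product("AB", repeat=len(freq_a)):
        freqs = [f if allele == "A" else 1.0 - f
                 for allele, f in zip(haplotype, freq_a)]
        counts["".join(haplotype)] = prod(freqs) * n_chromosomes
    return counts


if __name__ == "__main__":
    expected = expected_haplotype_counts([0.66, 0.83, 0.66], 192)
    for name, count in expected.items():
        print(name, round(count, 1))
```

Comparing these expected counts with the observed haplotype counts is the basis of the equilibrium test reported in the Results.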

The automated PCR/OLA assay can be applied in many
different basic research and clinical areas. Genetic diseases
fall into several different categories including the common
and widespread mutations of sickle cell disease,
α1-antitrypsin or CF, and newly arising spontaneous mutations
such as Lesch-Nyhan disease (35). Clearly, PCR/OLA
facilitates the analysis of the common mutations, either in
screening at-risk members of families with diseases or for
more general carrier screening purposes. Rapid techniques
are being developed to identify the sequence variations of
newly arising mutations (35, 36). Once identified, the combined
PCR/OLA procedure can be used to follow the inheritance
of these specific mutations in affected families. Many
genes cause a predisposition toward disease. This is true of
the α1-antitrypsin mutation described above. Recently, it has
been demonstrated that certain TCR and HLA haplotypes
may predispose humans to certain autoimmune diseases such
as multiple sclerosis (37-39). Therapeutic strategies are being
developed to circumvent these predispositions (40-42).
Therefore, automated screening may be useful in the near
future to identify the genes associated with disease
predispositions in which some form of preventive therapy can be
initiated.

Fig. 4. Schematic diagram of the human TCRβ locus giving the relative order of the V, D, J, and C gene segments. DNA polymorphisms
in three indicated gene segments were analyzed in 96 individuals. Their location, where known, is shown (arrow up). The nucleotide substitutions
analyzed and the frequency for each variant in these samples are shown. (Scale bar: 100 kb.)

The automated PCR/OLA procedure provides a powerful
approach to high-resolution genetic linkage mapping of the
human genome or other complex genomes. For this approach,
sequence-tagged sites (STSs) (43) from specific chromosomal
regions (e.g., the TCRβ locus) or from a specific
chromosome (e.g., STSs obtained from random clones of a
flow-sorted chromosome library) would be scanned for internal
DNA sequence polymorphisms (9-11) to obtain a set
of polymorphic STSs. Once acquired, polymorphic STSs can
be rapidly ordered by analysis of large multigeneration families
or by single-sperm typing (44, 45) using the automated
PCR/OLA system.

The availability of human polymorphic STSs will also
provide a set of markers for automated forensic typing. For
example, with a set of maximally informative biallelic markers
(50:50 distribution in random mating populations) from
each of the 22 human autosomes, the probability that two
individuals would have identical DNA fingerprints, i.e., the
same set of the 44 alleles, is 1 in 10^[exponent illegible]. The automated
PCR/OLA procedure eliminates most of the limitations
associated with forensic typing by conventional Southern blot
analysis (e.g., the measurement of DNA fragment sizes, the
requirement for high quality DNA, and the use of
radioisotopes).
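
A match probability of this kind can be estimated with a short calculation. The sketch below is one possible model and not necessarily the one behind the figure quoted above: it assumes Hardy-Weinberg genotype proportions at each marker, independent markers, and unrelated individuals, and it does not attempt to reproduce the quoted figure.

```python
def genotype_match_probability(p=0.5):
    """Chance that two unrelated people share a genotype at one
    biallelic marker with allele frequencies p and 1 - p."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)   # AA, AB, BB
    return sum(f * f for f in genotype_freqs)


def profile_match_probability(n_loci, p=0.5):
    """Match probability across n_loci independent markers."""
    return genotype_match_probability(p) ** n_loci


if __name__ == "__main__":
    per_locus = genotype_match_probability()
    overall = profile_match_probability(22)
    print(f"per locus: {per_locus:.3f}")
    print(f"22 loci: about 1 in {1.0 / overall:.2e}")
```

With 50:50 allele frequencies the per-marker match probability is 0.375 under these assumptions, so 22 such markers give a match probability on the order of one in a few billion.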

Other applications for automated DNA diagnosis by the
PCR/OLA procedure include HLA typing, the analysis of
recessive or dominant oncogenes, and the identification of
infectious pathogens. The use of commercially available
thermostable ligases and automated ligation amplification
reactions in the direct detection of single copy genes can also
be explored. Moreover, multiple nonisotopic reporter groups
may be developed that will be simultaneously analyzed in a
single microtiter well. This raises the possibility of
multiplexing the OLA procedure to the point where initially both
alleles can be analyzed together and eventually multiple
biallelic loci can be typed in a single well. These and other
improvements, such as a single instrument to perform the
entire analysis, will greatly increase the throughput and
potential applications of automated DNA diagnostics.

generated a substrate that was extended by
the polymerase to a complete 50-bp duplex
molecule (Fig. 4). This confirms the result
shown in Fig. 2B that Rad1-Rad10 removes
the 3' single-stranded tail, and indicates
that Rad1-Rad10 cleavage products contain
3'-OH groups, the required substrate for
extension by DNA polymerase. Hence,
Rad1-Rad10 endonuclease products are suitable
substrates for a necessary subsequent step
in both the SSA recombination and NER
models.

Padlock Probes: Circularizing Oligonucleotides
for Localized DNA Detection

Mats Nilsson, Helena Malmgren, Martina Samiotaki,
Marek Kwiatkowski, Bhanu P. Chowdhary, Ulf Landegren*

Nucleotide sequence information derived from DNA segments of the human and other
genomes is accumulating rapidly. However, it frequently proves difficult to use such
short DNA segments to identify clones in genomic libraries or fragments in blots of
the whole genome or for in situ analysis of chromosomes. Oligonucleotide probes,
consisting of two target-complementary segments, connected by a linker sequence,
were designed. Upon recognition of the specific nucleic acid molecule the ends of the
probes were joined through the action of a ligase, creating circular DNA molecules
catenated to the target sequence. These probes thus provide highly specific detection
with minimal background.



The application of synthetic oligonucleotides
in combination with nucleic acid-specific
enzymes has brought simplicity
and convenience to molecular genetic
analyses. There is, however, a need for
methods in which oligonucleotides can be
used for localized detection of single-copy
gene sequences and for distinction among
sequence variants in microscopic specimens.
Such methods would help to bridge the
analytic gap between specific gene sequences
and subcellular structures. We have
developed oligonucleotide probe molecules
that should be useful for localized detection
of specific nucleic acids. These "padlock"
probes are composed of two target-complementary
segments, connected by a linker
that may carry detectable functions. The
two ends of the linear oligonucleotide
probes are brought in juxtaposition by
hybridization to a target sequence. This
juxtaposition allows the two probe segments to
be covalently joined by the action of a
DNA ligase. Because of the helical nature
of DNA, circularized probes are wound
around the target strand, topologically
connecting probes to target molecules through
catenation, in a manner similar to padlocks.
The requirement for simultaneous hybridization
of two different probe segments to
target molecules provides for high specificity
of detection in complex populations of
nucleic acids (1). Moreover, the act of ligation
permits facile distinction among similar
target sequence variants as terminally
mismatched probes are poor substrates for
ligases (1, 2). Finally, the covalent catenation
of probe molecules to target sequences
described here results in the formation of a
hybrid that resists extreme washing conditions,
serving to reduce nonspecific signals
in genetic assays.

Probes useful for circularization experiments
were constructed by solid phase synthesis
of oligonucleotides that contained
two hybridizing regions of 20 nucleotides
each, connected by a 50-nucleotide-long
linker segment (Fig. 1). Phosphate groups
were added at the 5' ends of the molecules
as required for enzymatic ligation. Alternatively,
residues of hexaethylene glycol
(HEG) were incorporated in the linker segment
during standard solid phase synthesis
(3). The HEG residues served to reduce the
number of synthetic steps required to span
the ends of the two target-complementary
segments.
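
The probe layout described above (two 20-nucleotide target-complementary arms joined by a 50-nucleotide linker) can be sketched as a small design function. The target and linker sequences below are hypothetical placeholders, and the function names are illustrative; only the arm and linker lengths come from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_padlock(target: str, linker: str) -> str:
    """Return a linear padlock probe (5'->3') for an even-length target.

    The strand complementary to the target is split at its midpoint. The
    second half becomes the 5' arm and the first half the 3' arm, so the
    two probe ends meet at the middle of the target when hybridized.
    """
    if len(target) % 2:
        raise ValueError("target length must be even")
    complement = reverse_complement(target)
    half = len(complement) // 2
    three_prime_arm = complement[:half]
    five_prime_arm = complement[half:]
    return five_prime_arm + linker + three_prime_arm


def circle_spans_target(probe: str, target: str) -> bool:
    """True if the ligated (circular) probe pairs with the whole target."""
    return reverse_complement(target) in probe + probe


# Hypothetical 40-nt target and 50-nt linker, for illustration only.
TARGET = "ACGTTGCAAGCTTGGCACTGGCCGTCGTTTTACAACGTCG"
LINKER = "T" * 50

if __name__ == "__main__":
    probe = design_padlock(TARGET, LINKER)
    print(len(probe), circle_spans_target(probe, TARGET))
```

Doubling the linear sequence in `circle_spans_target` is a simple way to read across the ligation junction of the circular molecule.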

Cyclizable probes were designed to detect
a 40-nucleotide target sequence, represented
either by an oligonucleotide
molecule or by the polylinker sequence of
the single-stranded form of the circular
cloning vector M13 mp18. Ligation products
could be separated by denaturing
polyacrylamide gel electrophoresis (Fig.
2A). In the presence of the oligonucleotide
target, linear probes were efficiently
converted to circular molecules with a
distinct rate of migration. Probes interacting
with M13 target molecules were converted
to a species catenated to and therefore
migrating with the large M13 molecule
during denaturing gel electrophoresis.
As the probes were labeled by the
addition of a radioactive phosphate group
at the 5' terminus, only ligated molecules
retained their label after treatment with
alkaline phosphatase. Circular oligonucleotides
are insensitive to digestion with
exonuclease VII, which attacks at free 5'
or 3' ends of DNA strands (4). Depending
on how the probes are labeled, phosphatases
or exonucleases could be used to
remove any signal arising from unreacted
probes in various assays, thus reducing
background (5).

We also investigated the consequences of cyclically repeating the probe
hybridization and ligation reaction. The amount of cyclized probe
molecules increased linearly with the number of ligation cycles when a
short oligonucleotide target was used (Fig. 2B). By contrast, under the
same conditions the maximal number of probes were bound to the closed,
circular M13 target molecule in a single ligation cycle; thereafter the
signal decreased, probably because of scission of the single-stranded
target molecule during heat denaturation. Thus, a single probe may be
catenated to each circular target molecule. This indicates that
circularized probe molecules, constrained to one-dimensional diffusion
along the target strand during heat denaturation, rapidly occupy the
correct target sequence before new probes bind to this sequence when
the temperature is lowered. Repeated cycles of ligation can, however,
increase the probability that any target sequence will be detected by
probe molecules specific for that target, particularly when
allele-specific probes are used to distinguish among sequence variants.

Investigators can use oligonucleotide probe ligation reactions to
distinguish among related DNA sequences by studying their ability to
serve as templates for ligation of oligonucleotides complementary to
one or the other sequence variant (1). Whereas probes specific for one
of the two sequence variants may hybridize stably to either of the two
sequences, only target molecules correctly base-paired to the
juxtaposed ends of the probes can assist in the ligation. We
investigated the capacity of the padlock probes to distinguish between
a normal and a mutant DNA sequence in plasmid clones immobilized on
nylon membranes (Fig. 3). Plasmids containing the ΔF508 variant of the
cystic fibrosis transmembrane conductance regulator (CFTR) gene or the
corresponding normal gene segment were spotted on nylon membranes and
subjected to probe hybridization and ligation. The mutation removes 3
base pairs (bp) (6) corresponding to the 3' end of the circularizable
probe. Probe molecules specific for the normal sequence gave rise to a
signal only when reacted with the normal sequence but not with the
ΔF508 variant of the CFTR gene when probe ligation was followed by
denaturing washes in 0.2 M NaOH for 5 min. This stringent wash (to
interrupt hybridization between DNA molecules) permitted efficient
distinction between the allelic variants, as only cyclized probes
remain bound to the membrane. By contrast, a stringent but
nondenaturing wash of the same probes in a solution of 2% SDS in 0.1×
standard saline citrate (SSC) gave poor distinction between the two
target sequences. Because signal strength is preserved under conditions
that prevent hybridization between complementary DNA strands,
nonspecifically trapped probe molecules may be efficiently removed,
resulting in a reduction of the level of background in gene detection
reactions.

Fig. 1. Structure of a padlock probe interacting with its target
sequence. (A) Molecular model of the probe-target complex. The
molecular model was prepared on a Silicon Graphics workstation, with
Insight II (Biosym Technologies). (B) Sequence composition of a probe,
specific for a segment present in the M13 cloning vector sequence. At
the 5' end of the probe, beginning with a phosphate group, 20
target-complementary nucleotide positions are shown in red. Directly
contiguous with these is a linker segment of 50 T residues, shown in
green. Finally, the 20 nucleotides at the 3' end of the probe are
yellow. The target sequence is shown in blue.

Fig. 2. Analysis by gel electrophoresis of the target-dependent
circularization of an oligonucleotide probe. (A) A 90-bp
oligonucleotide probe [5'-TGCCTGCAGGTCGACTCTAG(T)50CGGCCAGTGCCAAGCTTGCA-3';
see also Fig. 1] was designed such that its 5' and 3' ends would
hybridize adjacent to each other to a segment of the polylinker region
of the M13 mp18 cloning vector. The probe was gel-purified and
5'-phosphorylated by T4 polynucleotide kinase (New England Biolabs) and
[γ-32P]ATP (3000 Ci/mmol, Dupont). To ensure that most or all 5' ends
were phosphorylated, a second kinase incubation was performed in the
presence of a 20-fold excess of adenosine triphosphate (ATP). The
labeled probe (6 pmol) was incubated with 3 pmol of either of two
different templates: the 7.2-kb, single-stranded, circular M13 mp18
molecule or an oligonucleotide
(5'-TTTTTCTAGAGTCGACCTGCAGGCATGCAAGCTTGGCACTGGCCGTTTTT-3') that
contained the same 40-bp target sequence, in 100 µl of 20 mM tris-HCl
(pH 8.3), 25 mM KCl, 10 mM MgCl2, 1 mM NAD+, 0.01% Triton X-100, and
200 U of Ampligase (Epicentre Technologies). The reactions were heated
to [...] (1 min), then cooled to [...] (5 min) and chilled on ice.
Samples (10 µl) were taken from the ligation reactions and treated with
either 0.5 U of calf intestinal alkaline phosphatase (CIP; New England
Biolabs) or 0.1 U of exonuclease VII (Exo VII; Gibco/BRL). (B) The same
probe (9 pmol) was subjected to repeated cycles of ligation, separated
by heat denaturation steps, in the presence of 0.3-pmol oligonucleotide
target (open circles) or the circular single-stranded target molecule
(filled circles). Radioactive ligation products, accumulated after the
indicated number of cycles, were separated by gel electrophoresis on a
6% denaturing polyacrylamide gel and quantitated with a PhosphorImager
(Molecular Dynamics).

As indicated in Fig. 2B, circularized probe molecules are free to
travel considerable distances along the target strands during
denaturing washes. To measure the distance traveled, probe-cyclization
reactions were carried out on equivalent numbers of covalently closed
target molecules or molecules that had been linearized at variable
distances from the probe-complementary sequence before being
immobilized on nylon membranes (Fig. 3B). Few probe molecules that were
cyclized around target strands interrupted approximately 150
nucleotides from the probe-complementary sequence remained after
denaturing washes. By contrast, strands digested 850 nucleotides from
the probe-complement retained similar numbers of probes as did
uninterrupted strands. The greater preservation of signal upon
denaturing washes of probes bound to the longer linear target molecules
probably reflects the increased likelihood that target molecules were
cross-linked to the membrane on both sides of the site where the probe
was catenated. This trapping of circularized probes by catenation to
linear target molecules, in combination with the specific detection
afforded by the requirement that two different probe segments
simultaneously react with the target sequence, should be of value in
procedures such as DNA blotting or for screening genomic libraries with
short probe sequences.

Currently, oligonucleotide probes find limited applications for in situ
analysis of gene sequences in metaphase chromosomes. This is a
consequence of problems both with specificity of detection and
sensitivity of visualization. A circularizable probe, specific for a
repeated centromeric motif characteristic of human chromosome 12 (7),
was used for in situ hybridization followed by ligation in human
metaphase chromosome preparations. A wide range of washing conditions,
including ones that remove specifically hybridizing oligonucleotide or
longer probes, preserved signals from in situ circularized probe
molecules and permitted efficient distinction from alphoid repeat
sequences present on other human chromosomes (Fig. 4). Given
sufficiently sensitive techniques for detection of probe molecules, the
high specificity of padlock probes in conjunction with the reduced
nonspecific background observed should permit detection of short,
single-copy DNA sequences in human chromosomes. Increased signal could
be obtained by secondary ligation of detectable molecules to the linker
segment of bound probes. Thus, oligonucleotide probes could be used to
screen for the presence of known mutations in loci distributed along
the chromosomes, by means of color-coded probes specific for normal and
mutant sequence variants. Furthermore, probe cyclization reactions
depend on an intramolecular reaction as opposed to reaction between
pairs of independent probe molecules as in amplification by the
polymerase chain reaction. Thus, there should be fewer problems with
nonspecific reactions resulting from interactions between noncognate
pairs of probe segments with cyclizable probes. The present probe
design should permit the simultaneous analysis of multiple gene
sequences in a DNA sample.

Fig. 3. Distinction of target DNA molecules immobilized [...] probe.
(A) Fifteen femtomoles of two plasmids containing the normal or a 3-bp
deleted variant of the CFTR gene were spotted on nylon membranes
(PALL). The filters [...] left for 10 min at room temperature; filters
were then washed [...] to remove plasmids that had not been fixed to
the membrane. [...] per milliliter was hybridized to the membranes for
30 min in 5× SSPE (9), 5× Denhardt's solution (9), and salmon [...]
probe contained amino-modified C residues to which bio[...] (Clontech
Laboratories) as described (10). Next, the membranes were incubated for
1 hour at room temperature in a solution of 10 mM tris, pH 7.5, 10 mM
Mg(Ac)2, 50 [...] U of T4 DNA ligase per microliter (Pharmacia). The
mem[...] 1× SSPE for 30 min and in either 2% SDS in 0.1× SSC for 30 min
for a stringent wash or, for a denaturing wash, in 0.2 M NaOH for 5
min, and then in 1× SSPE, 2% SDS, for 30 min. A signal was generated by
incubating the membranes for 5 min in streptavidin [...] (Boehringer
Mannheim) in 2× SSPE, 2% SDS, washing in PBS for 30 to 60 min, [...]
(Amersham) for 1 min. The chemiluminescent sig[...] the normal (N) or
mutant (M) variants of the target mo[...] indicated distances from the
sequence complementary to [...]zation on nylon membranes; the plasmids
were probed by hy[...] followed by a ligation step and a denaturing
wash in 0.2 M [...]

Fig. 4. Detection of a chromosome 12-specific repeated sequence in
human metaphase chromosomes, by in situ hybridization and ligation of a
biotinylated circularizable probe. Metaphase chromosome preparations
were obtained from a human lymphocyte culture by standard techniques of
colcemide treatment, hypotonic shock, and fixation in methanol + acetic
acid. In situ hybridization and ligation were performed by a
modification of the procedure described (11). The slides were treated
with ribonuclease A at 200 µg/ml in 2× SSC (9) for 1 hour at 37°C,
dehydrated in a series of 90, 95, and 99% ice-cold ethanol washes for 2
min each, and air-dried. The chromosome preparations were then
denatured in 70% formamide, 2× SSC at 70°C for 2 min; immediately
dehydrated in a series of 70, 90, 95, and 99% ice-cold ethanol washes
for 2 min each; and air-dried. Circularizable probe (10 [...]) specific
for [...]some 12 (5'-P-AAATCTXAACTGGAAACTG-[... HEG linker
...]-ATTTGGTCTCAAAGTGATT-3') was hybridized for 18 hours at 37°C in 2×
SSC, 20% formamide and salmon sperm DNA [...] 25-µl volume on each
slide. A 5-min wash in 2× SSC at 37°C and a brief wash in 10 mM tris,
pH [...], 10 mM Mg(Ac)2, 50 mM KAc, 10 mM ATP preceded ligation in the
same buffer, containing T4 DNA li[...] ([...] U/µl) for 1 hour at 37°C.
The slides were washed twice in 2× SSC with 20% for[...] for 5 min
each, followed by two washes in 2× SSC and once in PN buffer (0.1 M
NaH2PO4, 0.1% NP-40, adjusted to pH 8.0 with 0.1 M Na2HPO4) at 37°C, 5
min each. Bound probes were visualized by means of fluorescein-labeled
avidin, followed by a lay[...] (Vector Laboratories), and a second
layer of fluoresceinated avidin. [...] buffer containing 5% nonfat milk
at 37°C for 20 min followed by three washes in PN buffer at room
temperature for 5 min each. The metaphase chromosomes were stained with
propidium iodide and photographed with a Nikon [...] microscope.

In conclusion, the nucleic acid probe presented here permits highly
specific detection of nucleotide sequences and, although the target is
not amplified, highly sensitive detection is possible through efficient
reduction of nonspecific signal. Circularizable probes should be
applicable in a number of other contexts, including the detection of
specific RNA molecules expressed in tissue sections, as T4 DNA ligase
can assist in ligation reactions involving RNA strands (8). Moreover,
immobilized padlock probes could be useful for preparative purposes,
such as trapping circular target molecules from solution when screening
gene libraries.

Localization of a Breast Cancer Susceptibility
Gene, BRCA2, to Chromosome 13q12-13

Richard Wooster,* Susan L. Neuhausen,* Jonathan Mangion,*
Yvette Quirk,* Deborah Ford,* Nadine Collins, Kim Nguyen,
Sheila Seal, Thao Tran, Diane Averill, Patty Fields, Gill Marshall,
Steven Narod, Gilbert M. Lenoir, Henry Lynch, Jean Feunteun,
Peter Devilee, Cees J. Cornelisse, Fred H. Menko, Peter A. Daly,
Wilma Ormiston, Ross McManus, Carole Pye, Cathryn M. Lewis,
Lisa A. Cannon-Albright, Julian Peto, Bruce A. J. Ponder,
Mark H. Skolnick, Douglas F. Easton,† David E. Goldgar,
Michael R. Stratton

A small proportion of breast cancer, in particular those cases arising
at a young age, is due to the inheritance of dominant susceptibility
genes conferring a high risk of the disease. A genomic linkage search
was performed with 15 high-risk breast cancer families that were
unlinked to the BRCA1 locus on chromosome 17q21. This analysis
localized a second breast cancer susceptibility locus, BRCA2, to a
6-centimorgan interval on chromosome 13q12-13. Preliminary evidence
suggests that BRCA2 confers a high risk of breast cancer but, unlike
BRCA1, does not confer a substantially elevated risk of ovarian cancer.



In 1990, a breast cancer susceptibility gene, known as BRCA1, was
localized to chromosome 17q (1). Subsequent studies demonstrated that
BRCA1 accounts for most families with multiple cases of both
early-onset breast and ovarian cancer and about 45% of families with
breast cancer only, but few if any families with both male and female
breast cancer (2). Several other genes can confer susceptibility to
breast cancer. Germline mutations in the p53 gene on chromosome 17p
cause a wide range of neoplasms including early-onset breast cancer,
sarcomas, brain tumors, leukemias, and adrenocortical cancer (3).
Certain rare abnormalities of the androgen receptor appear to be
associated with breast cancer in men (4), and epidemiological studies
have suggested that heterozygotes for the ataxia telangiectasia gene,
AT, on chromosome 11q are at elevated risk of breast cancer (5).
However, mutations in p53 and AT can only be responsible for a small
minority of breast cancer families that are unlinked to BRCA1 (6).

R. Wooster, J. Mangion, Y. Quirk, N. Collins, S. Seal, M. R. Stratton,
Section of Molecular Carcinogenesis, Institute of Cancer Research,
Sutton, Surrey SM2 5NG, UK.
S. L. Neuhausen, K. Nguyen, T. Tran, P. Fields, C. M. Lewis,
M. H. Skolnick, D. E. Goldgar, Department of Medical Informatics,
University of Utah, [...] 84108, USA.
D. Ford, D. Averill, G. Marshall, J. Peto, D. F. Easton, Section of
Epidemiology, Institute of Cancer Research, Sutton, Surrey SM2 5NG, UK.
S. Narod, Department of Medicine, Division of Medical Genetics and
Division of Human Genetics, McGill University, Montreal, Canada
H3G 1A4.
G. M. Lenoir, International Agency for Research on Cancer, 150 Cours
Albert-Thomas, 69372 Lyon Cedex 08, France.
H. Lynch, Department of Preventive Medicine and Public Health,
Creighton University School of Medicine, Omaha, NE 68178, USA.
J. Feunteun, Institut Gustave-Roussy, Villejuif, France.
P. Devilee and C. J. Cornelisse, Departments of Pathology and Human
Genetics, University of Leiden, 2333 AL Leiden, Netherlands.
F. H. Menko, Department of Clinical Genetics, Free University of
Amsterdam, 1007 MB Amsterdam, Netherlands.
P. A. Daly, W. Ormiston, R. McManus, Department of Medicine, Trinity
College Medical School, St. James Hos[...]
C. Pye and B. A. J. Ponder, CRC Human Cancer Genetics Group, Department
of Pathology, University of Cambridge, Cambridge CB2 1QP, UK.
L. A. Cannon-Albright, Department of Internal Medicine, University of
Utah, Salt Lake City, UT 84132, USA.

*These authors contributed equally to this study.
†To whom correspondence should be addressed.

To localize other genes that predispose to breast cancer, we performed
a genomic linkage search using 15 families that had multiple cases of
early-onset breast cancer and that were not linked to BRCA1. These
families were classified according to the number of cases of female
breast cancer, male breast cancer, and ovarian cancer (Table 1). In
addition to a negative lod score (logarithm of the likelihood ratio for
linkage) with markers flanking BRCA1, all but one of the families used
for this study had at least one breast cancer case diagnosed before age
50 that did not share a BRCA1 haplotype with other breast cancer cases
in the family. The exception, CRC 136, had an obligate sporadic case
diagnosed at age 53. Families were genotyped with polymorphic
microsatellite repeat markers (7, 8). Typing of the markers D13S260 and
D13S263 provided provisional evidence for the presence of a
susceptibility gene on chromosome 13, which was subsequently confirmed
by analysis of additional polymorphisms in the region.
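
For readers unfamiliar with the lod score mentioned above, the following
Python sketch shows the textbook two-point form of the statistic for the
simplest case of phase-known, fully informative meioses. It is an
assumption-laden illustration, not the pedigree likelihood calculation a
linkage study of this kind would actually require; the function name and
arguments are hypothetical.

```python
import math

# Sketch: two-point lod score for phase-known, fully informative meioses.
# The likelihood of observing k recombinants in n meioses at recombination
# fraction theta is theta**k * (1 - theta)**(n - k); the lod score compares
# it with the same likelihood under free recombination (theta = 0.5).


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Return log10[L(theta) / L(0.5)] for the simplified model above."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    nonrecombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant is impossible under complete linkage.
        if recombinants > 0:
            return float("-inf")
        log_likelihood = 0.0
    else:
        log_likelihood = (recombinants * math.log10(theta)
                          + nonrecombinants * math.log10(1.0 - theta))
    log_null = meioses * math.log10(0.5)
    return log_likelihood - log_null


# Ten meioses with no recombinants give a lod of about 3 at theta = 0,
# whereas evaluating at theta = 0.5 always gives a lod of 0.
print(round(lod_score(0, 10, 0.0), 2))
print(round(lod_score(2, 10, 0.5), 2))
```

A negative lod score at small values of theta, as reported for the
markers flanking BRCA1, means the data are less likely under linkage
than under free recombination.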
SNP attack on complex traits

Single nucleotide polymorphisms (SNPs) are major contributors to genetic
variation, comprising some 80% of all known polymorphisms, and their
density in the human genome is estimated to be on average 1 per 1,000
base pairs. Although SNPs are mostly biallelic (and consequently less
informative than microsatellite markers), they are more frequent and
mutationally more stable, making them suitable for association studies
in which linkage disequilibrium (LD) between markers and an unknown
variant is used to map disease-causing mutations. In addition, because
SNPs have only two alleles, they can be genotyped by a simple plus/minus
assay rather than a length measurement, making them more amenable to
automation.

These are good reasons to develop SNPs as useful markers, but hardly
sufficient to explain the momentum that the SNP movement has recently
acquired, which stems from the hope that SNP-based approaches will lead
to progress in the search for genetic variation associated with common
diseases or sensitivity to drugs. At a recent meeting, advances in SNP
technology and SNP-based approaches to tackle complex traits as well as
questions of human origin and prehistory were discussed. Frustrated with
linkage analysis, which has had little success in identifying genes
involved in determining complex traits, many geneticists have turned
towards association studies, which might be better suited to detecting
genetic effects of low penetrance with higher resolution. For such
studies, many more markers will be required, in addition to better
statistical tools and high-throughput low-cost genotyping technology to
analyse large marker sets in many samples.

Increasing amounts of sequence data available in public and private
databases (within which SNPs can be discovered in silico; Pui-Yan Kwok,
Macdonald Morris), efforts underway to re-sequence DNA stretches from
several individuals, and the use of 'SNP discovery' technology (such as
denaturing high performance liquid chromatography; Peter Underhill),
have led to the rapid accumulation of catalogued SNPs. So far, no SNP
has been patented, but a number of applications are pending (Christian
Stein), and it seems likely that many will end up in proprietary
collections. Even with the best tools, understanding complex traits and
human variation will be a challenge, to say the least; sharing resources
will help. Two publicly available SNP databases as well as several SNP
collections exist at present (see box), and researchers are encouraged
to submit any SNP that they discover.

The technological and economic goal is accurate, easy, cheap and fast
large-scale SNP genotyping. Several methods are currently being
developed, and it is unclear which one(s) will turn out to be the best.
Examples based on minisequencing on DNA arrays (Ann-Christine Syvänen,
Andres Metspalu), dynamic allele-specific hybridization (DASH, Anthony
Brookes), microplate array diagonal gel electrophoresis (MADGE, Ian
Day), pyrosequencing (Pål Nyrén), oligonucleotide-specific ligation
(according to Ed Southern, the most sensitive assay) as well as the
Whitehead/Affymetrix SNP chips (Jian-Bing Fan) and the TaqMan system
(Ken Livak) were discussed. All of them require target amplification of
each SNP by PCR. Even in the light of encouraging progress in
multiplexing PCR (Michelle Cargill), a large number of individual
reactions is required and the cost is considerable (James Weber).
Ideally, one would like to determine the genotype directly from genomic
DNA. Methods based on the generation of small signal molecules by
invasive cleavage followed by mass spectrometry (Timothy Griffin) or
immobilized padlock probes and rolling-circle amplification (Ulf
Landegren) might eventually eliminate the need for PCR.

SNP databases

HGBASE (http://hgbase.interactiva.de) collects intragenic SNPs and
contains approximately 2,700 entries. It is searchable by sequence and,
at the moment, the only database where information can be deposited and
retrieved.

dbSNP (http://www.ncbi.nlm.nih.gov/SNP/), a joint effort by the NHGRI
and the NCBI, is now accepting submissions. Its creators are still
working on making content available; the database [...] accession
number and [...] integrated with GenBank.

The [...] database (approximately [...] mapped) [...] contains several
[...] currently being integrated into dbSNP.

Apart from the challenges of generating SNP maps and efficient
genotyping, how easy will it be to determine which SNPs are suitable for
a particular question and how best to analyse the data? In the absence
of understanding what makes complex traits complex, classical mendelian
concepts (two alleles, normal versus abnormal) are usually imposed onto
a more complicated reality. Joseph Terwilliger warned that only if the
genes underlying complex diseases have one wild-type and one (or one
major) susceptibility allele, that is, when allelic heterogeneity is
low, is statistical analysis likely to detect association of the
causative allele (or linked markers) with the disease phenotype.
Intuitively, more markers should allow increased accuracy, but in
statistical reality, this also means larger samples will be necessary or
the risk of obtaining false positive results will increase. Skeptical
about the use of SNPs in disease genetics, Terwilliger is nonetheless
enthusiastic about their potential use in population genetics and
genetic epidemiology. By way of contrast, Marta Blumenfeld and Nik
Schork described a strategy by which they can overcome many of the
statistical obstacles of SNP-based association studies. By sequencing
DNA from a minimum of 100 individuals to establish SNP allele frequency,
calculating LD strength in a region of interest prior to determining how
many markers are needed, and analysing haplotypes (2-6 SNPs together)
instead of individual markers, they have been able to identify new genes
associated with complex traits; unfortunately the identities of the
genes were not disclosed, and so proof of principle is yet to be
provided.
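
The report does not say how LD strength was calculated. As a hedged
illustration, the Python sketch below computes three commonly used
pairwise measures (D, D' and r squared) from counts of the four
two-SNP haplotypes; the function name and the allele labels A/a and B/b
are assumptions introduced here, and nothing below should be read as the
speakers' actual procedure.

```python
# Sketch: pairwise linkage disequilibrium between two biallelic SNPs,
# computed from haplotype counts. Alleles are labelled A/a at the first
# SNP and B/b at the second; these labels are illustrative only.


def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D_prime, r_squared) for the given haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total

    # D is the excess of the AB haplotype over its expected frequency
    # if the two SNPs were independent.
    d = p_AB - p_A * p_B

    # D' rescales D by the largest value it could take given the
    # allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else float("nan")

    # r squared is the squared correlation between the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, d_prime, r_squared


# Two SNPs that always travel together versus two independent SNPs.
print(ld_stats(50, 0, 0, 50))
print(ld_stats(25, 25, 25, 25))
```

With 100 individuals (200 chromosomes), as in the strategy described,
counts like these would first require phased haplotypes, which is a
separate step not covered by this sketch.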

Although the jury is still out on whether SNPs will provide easy answers
to complex questions, they are increasingly popular with disease and
population geneticists. While the former mainly concentrate on SNPs
within or close to genes, the latter often prefer markers outside of
genes (to avoid selection) and in areas of the genome devoid of
recombination. Several approaches using SNPs on the Y chromosome (Chris
Tyler-Smith, Francesc Calafell) and in a low-recombination interval on
the X (Svante Pääbo) provide interesting leads on human history, as well
as data about age, frequency and population distribution of SNPs. Of
course, this is information directly relevant to disease geneticists,
and underscores the need for more interaction between population and
disease geneticists (Andrew Clark, Rosalind Harding). Knowledge about
population evolution and history will reveal suitable populations for
genetic studies and aid in study design and interpretation of results.

Time, or rather data, will tell whether SNPs live up to expectations.
As Aravinda Chakravarti stated in his abstract, "Each genetic approach,
considered either optimistic or pessimistic, has its underlying
assumptions. Human geneticists have to begin to test these assumptions
not by computer simulations and theoretical arguments but by empirical
observations".

Large-Scale Identification, Mapping, and
Genotyping of Single-Nucleotide
Polymorphisms in the Human Genome

David G. Wang, Jian-Bing Fan, Chia-Jen Siao, Anthony Berno,
Peter Young, Ron Sapolsky, Ghassan Ghandour,
Nancy Perkins, Ellen Winchester, Jessica Spencer,
Leonid Kruglyak, Lincoln Stein, Linda Hsie,
Thodoros Topaloglou, Earl Hubbell, Elizabeth Robinson,
Michael Mittmann, Macdonald S. Morris, Naiping Shen,
Dan Kilburn, John Rioux, Chad Nusbaum, Steve Rozen,
Thomas J. Hudson, Robert Lipshutz, Mark Chee,
Eric S. Lander*

Single-nucleotide polymorphisms (SNPs) are the most frequent type of
variation in the human genome, and they provide powerful tools for a
variety of medical genetic studies. In a large-scale survey for SNPs,
2.3 megabases of human genomic DNA was examined by a combination of
gel-based sequencing and high-density variation-detection DNA chips. A
total of 3241 candidate SNPs were identified. A genetic map was
constructed showing the location of 2227 of these SNPs. Prototype
genotyping chips were developed that allow simultaneous genotyping of
500 SNPs. The results provide a characterization of human diversity at
the nucleotide level and demonstrate the feasibility of large-scale
identification of human SNPs.



Although the Human Genome Project still has tremendous work ahead to
produce the first complete reference sequence of the human chromosomes,
attention is already focusing on the challenge of large-scale
characterization of the sequence variation among individuals (1). This
genetic diversity is of interest because it explains the basis of
heritable variation in disease susceptibility, as well as harbors a
record of human migrations.

The most common type of human genetic variation is the SNP, a position
at which two alternative bases occur at appreciable frequency (>1%) in
the human population. There has been growing recognition that large
collections of mapped SNPs would provide a powerful tool for human
genetic studies (1, 2). SNPs can serve as genetic markers for
identifying disease genes by linkage studies in families, linkage
disequilibrium in isolated populations, association analysis of patients
and controls, and loss-of-heterozygosity studies in tumors (1, 2).
Although individual SNPs are less informative than currently used
genetic markers (3), they are more abundant and have greater potential
for automation (4, 5).

We performed an initial survey to identify SNPs by using conventional
gel-based DNA sequencing to examine sequence-tagged sites (STSs)
distributed across the human genome. STSs are short genomic sequences
that can be amplified from DNA samples by means of a corresponding
polymerase chain reaction (PCR) assay. From among 24,568 STSs used in
the construction of a physical map of the human genome at the Whitehead
Institute for Biomedical Research/MIT Center for Genome Research (6, 7),
an initial collection of 1139 STSs was chosen (8). These STSs contained
a total of 279 kb of genomic sequence (9), with one-third from random
genomic sequence and two-thirds from 3'-ends of expressed sequence tags
(3'-ESTs) and primarily representing untranslated regions of genes. Each
STS was amplified from four samples (10): three individual samples and a
pool of 10 individuals (thereby permitting allele frequencies to be
estimated among 20 chromosomes). The PCR products were subjected to
single-pass DNA sequencing based on fluorescent-dye primers and gel
electrophoresis; sequence traces were compared by a computer program
followed by visual inspection (11). Candidate SNPs were declared when
two alleles were seen among the three individuals, with both alleles
present at a frequency greater than 30% in the pooled sample. The term
"candidate SNP" is used because a subset of such apparent polymorphisms
turn out to be sequencing artifacts, as discussed below.

The survey identified 279 candidate SNPs, distributed across 239 of the
STSs. This corresponds to a rate of one SNP per 1001 base pairs (bp)
screened and an observed nucleotide heterozygosity of H = 3.96 × 10^-4
(Table 1). Expressed sequences (3'-ESTs) showed a lower polymorphism
rate than random genomic sequence (with the difference falling just
short of statistical significance at P = 0.057, one-sided), consistent
with greater constraint within genic sequences. The ratio of transitions
to transversions was 2:1. Although the dinucleotide CpG makes up only
about 2% of the sequence surveyed, nearly 25% of the SNPs occurred at
such sites, with the substitution almost always being C↔T. Cytosine
residues within CpG dinucleotides are the most mutable sites within the
human genome, because most are methylated and can spontaneously
deaminate to yield a thymidine residue (12). In addition to the
single-base substitutions, 23 insertion/deletion polymorphisms were also
found (with all but eight involving a single base), corresponding to a
frequency of one per 12 kb surveyed.

Gel-based resequencing was satisfactory for the initial screen, but we
sought a more streamlined approach for a larger scale SNP
identification. One such approach involves hybridization to
high-density DNA probe arrays (13). Such "DNA chips" can be produced
with parallel light-directed chemistry to synthesize specified
oligonucleotide probes covalently bound at defined locations on a glass
surface or "chip" (14). A target DNA sequence of length L can be
screened for a polymorphism by hybridizing a biotin-labeled sample to a
variant detector array (VDA) of size 8L (Fig. 1). For each position on
both strands, the array has four 25-nucleotide oligomer probes
complementary to the sequence centered at the position. The four differ
only in that the central (13th) position is substituted by each of the
four nucleotides. Homozygotes (AA) for the expected sequence should
hybridize more strongly to the perfectly complementary probe than to
the three probes containing a central mismatch. The presence of an SNP
would be expected to give rise to a different hybridization pattern,
with homozygotes (BB) showing strong hybridization to an alternative
base and heterozygotes (AB) showing strong hybridization to two probes.
The VDA thus signals the presence of a sequence variation (by a change
in the hybridization pattern) and, in many cases, indicates the nature
of the change (by a gain of signal at a specific mismatch probe). VDAs
have been used for mutation detection of small, well-studied DNA
targets [such as 387 bp from the human immunodeficiency virus-1 genome,
3.5 kb from the breast cancer-associated BRCA1 gene, and 16.6 kb from
the human mitochondrion (13, 15)] in large numbers of samples. In this
setting, the normal hybridization pattern can be characterized with
precision and single-base substitutions detected with high accuracy.
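
The probe layout described above can be sketched in a few lines. This is only an illustration of the tiling idea under stated assumptions: the function names are hypothetical, positions closer than 12 bases to either end of the target are skipped (so the sketch yields 8(L - 24) probes rather than the nominal 8L), and the orientation conventions of a real chip design may differ.

```python
# Illustrative sketch of variant detector array (VDA) probe tiling.
# All names are hypothetical; this is not the authors' design software.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
PROBE_LEN = 25            # 25-nucleotide probes, as in the text
CENTER = PROBE_LEN // 2   # 0-based index 12, i.e. the 13th base


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def vda_probes(target: str):
    """Yield (strand, position, substituted_base, probe) tuples.

    For each strand and each position with 12 flanking bases on both
    sides, four probes are produced. Each is complementary to the
    25-base window centred on the position, with the central base of
    the window replaced by A, C, G or T in turn.
    """
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        for pos in range(CENTER, len(seq) - CENTER):
            window = seq[pos - CENTER: pos + CENTER + 1]
            for base in "ACGT":
                variant = window[:CENTER] + base + window[CENTER + 1:]
                yield strand, pos, base, reverse_complement(variant)
```

For every position exactly one of the four probes is a perfect complement of the expected sequence; a sample carrying an SNP would instead match one of the other three.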

In this project, we used VDAs in a large-scale survey. A total of
16,725 STSs covering 2 Mb of human DNA were selected, with one-third
from random genomic sequence and two-thirds from 3'-ESTs. The survey
used 149 distinct chip designs, each containing 150,000 to 300,000
features. The STSs were examined in seven individuals, representing
about 14 Mb of genomic sequence. For each chip, the corresponding STSs
were amplified from an individual, pooled together, labeled with
biotin, hybridized, and stained (16), and the resulting hybridization
patterns were compared by a computer program followed by visual
inspection (17). At each position, samples were classified as
homozygous for the expected sequence, homozygous for an alternative
sequence, or heterozygous.

A collection of 2748 candidate SNPs was identified, corresponding to a
rate of one per 721 bp surveyed and an observed nucleotide
heterozygosity of $4.58 \times 10^{-4}$ (Table 1). The number of STSs
containing SNPs was 2299. The SNPs had a mean heterozygosity of 33%,
with the minor allele having a mean frequency of 25%. SNPs were found
less often in 3'-ESTs than in random genomic sequence (P < 0.023,
one-sided), consistent with greater constraint in genic regions.

The nucleotide heterozygosity rate was indistinguishable from the
estimate obtained from gel-based sequencing (P > 0.12, two-sided test),
as was the ratio of transitions to transversions and the proportion of
SNPs occurring at CpG dinucleotides. SNPs were detected at a higher
frequency in the chip-based survey because more samples were surveyed
(seven versus three individuals). The observed increase of 38.8% (1/721
versus 1/1001) agreed closely



Fig. 1. SNP screening on chips. (A) Small portion of a VDA for an STS
hybridized with the expected target sequence. Chip features in each
column are complementary to successive overlapping 25-nucleotide
oligomer subsequences, with the central base substituted by A, C, G, or
T in the four rows. Variations from the expected sequence can directly
be detected by examination of the most intense signal in each column.
(B) The same VDA now hybridized with sequence containing an SNP (A to C)
at position 19. The hybridization signal is now stronger at an
alternative base at this position. It is also weaker at the surrounding
positions (for example, at the flanking positions up to 18 and from 20
onward) because probes at these positions are designed to be
complementary to the A allele at the SNP and mismatch with the C allele.
Table 1. Results of SNP screening.
with expectation under classical population genetic theory (18). This
result has implications for the choice of sample size in an SNP
survey (19).
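
The comparison with population genetic theory can be checked numerically. The sketch below assumes the neutral-theory relation used later in the article, in which the expected proportion of variable sites scales with the sum 1 + 1/2 + ... + 1/(n - 1) for n genomes surveyed, and it assumes n = 6 and n = 14 for three and seven diploid individuals. The function names are illustrative.

```python
from fractions import Fraction


def harmonic_sum(n: int) -> Fraction:
    """Return 1/1 + 1/2 + ... + 1/(n - 1) for n genomes surveyed."""
    return sum((Fraction(1, i) for i in range(1, n)), Fraction(0))


def expected_increase(n_small: int, n_large: int) -> float:
    """Expected fractional rise in SNP rate when sampling more genomes."""
    return float(harmonic_sum(n_large) / harmonic_sum(n_small)) - 1.0


# Three individuals (6 genomes) versus seven individuals (14 genomes).
expected = expected_increase(6, 14)

# Observed rates: one SNP per 1001 bp versus one per 721 bp.
observed = 1001 / 721 - 1
```

Under these assumptions the expected increase is a little over 39%, close to the observed 38.8%.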

We estimated the error rates in the gel-based and chip-based surveys.
The false-positive rate was estimated by carefully confirming candidate
SNPs found in each survey by using thorough multipass sequencing (20):
12% of 220 candidate SNPs found in the chip-based survey and 16% of 120
candidate SNPs found by single-pass gel-based sequencing were false
positives. The false-negative rate was estimated by considering a
subset of STSs that had been included in both surveys: these STSs
yielded 55 SNPs (all carefully confirmed to eliminate false positives),
of which eight (15%) were missed by single-pass gel-based resequencing
and seven (13%) were missed by the chip-based survey. Many of the
errors were due to random factors, in that they were eliminated simply
by repeating the original experiment. However, some were reproducible
artifacts that could be eliminated only by changing the detection
protocol (for example, by using dye terminators rather than dye primers
in gel-based sequencing). The gel-based sequencing and chip-based
analysis had similar rates of accuracy, with a false positive and
negative being found roughly every 5000 to 10,000 bases, or about 10%
of the true SNP frequency. The accuracy largely reflects the particular
implementation of the technologies in a high-throughput setting and
could be increased at the expense of assay optimization.

Although the two surveys yielded comparable accuracy, the survey based
on VDAs required considerably less laboratory work than gel-based
resequencing. Both approaches required amplifying target loci. The
gel-based approach then required a sequencing reaction and
electrophoresis on each individual locus, whereas the chip-based
approach allowed targets totaling 30 kb to be pooled into a single
labeling reaction and hybridized (21).

The SNP collection from the two surveys was supplemented by two
directed approaches based on public databases. First, we collected
reports from the literature of common variants in gene coding regions.
We were able to confirm 120 of 143 cases tested by virtue of detecting
two alleles in our screening panel; the remainder may be true
polymorphisms but simply monomorphic in the individuals tested. Second,
the GenBank database contains multiple entries for some ESTs. Such
entries were compared to identify single-nucleotide differences, which
might reflect either common polymorphisms or sequencing errors in
single-pass EST sequencing. We tested 200 such apparent differences and
confirmed



Fig. 2. A portion of the SNP genetic map (showing human chromosome 1).
The full map is available on the Whitehead Institute Web site
(www.genome.wi.mit.edu). Positions are based on genetic distances in
centimorgans. Genetic positions of SNPs were inferred by localizing
them relative to framework markers by RH mapping and then interpolating
distances from centirays (on the RH map) to centimorgans (on the
genetic map). Framework marker names are given in full. SNP names are
named with the prefix WIAF (for example, WIAF-17), but the prefix is
dropped and only the number is shown in the figure.



the presence of an SNP in 94 cases. These two directed approaches thus
yielded an additional 214 SNPs.

The project has thus identified 3241 candidate SNPs to date.
Confirmation (22) has so far been obtained for 1477 SNPs and is
expected to yield about 2900 true SNPs. All information about the SNPs
has been deposited on the Whitehead/MIT Center for Genome Research Web
site (www.genome.wi.mit.edu) and will be updated with results of
additional surveys and confirmation tests. The information is also
being deposited in the GenBank database.

For SNPs to be useful in human genetic studies, they must be assembled
into maps showing their chromosomal location. To create a
third-generation map based on SNPs, we used whole-genome
radiation-hybrid (RH) mapping (6, 7, 23), which infers the position of
loci based on co-retention in a panel of human-hamster cell lines; it
has become a primary method for constructing maps of the human
genome (6, 7).

The current RH map of the human genome is anchored by a scaffold of
1036 genetic markers from an earlier genetic map consisting of simple
sequence length polymorphisms (SSLPs) (7). SNPs can be integrated with
respect to the earlier genetic map by determining their position on the
RH map. We have localized 1880 STSs, containing 2227 of the 3241
candidate SNPs, on the RH map and thereby relative to the human genetic
map (Fig. 2 and Table 2). SNPs are not evenly distributed among
chromosomes or within chromosomes because most were derived from ESTs,
which are known to have an uneven distribution (6, 7). SNP-containing
STSs are present at a mean spacing of 2.0 centimorgans (cM) across the
genome (24), and the map contains 58 intervals greater than 10 cM. The
genetic distances on the map must be regarded as approximate because
they are based on interpolation from distances in the RH map. It will
be desirable to reestimate these distances on the basis of direct
linkage analysis in the CEPH families, as high-throughput genotyping
for the complete SNP collection becomes feasible.
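
The interpolation step mentioned above can be illustrated with a minimal sketch. It assumes simple linear interpolation between the two framework markers that flank a locus on the RH map; the article does not spell out the exact procedure, and the function name and the marker coordinates in the usage note are hypothetical.

```python
from bisect import bisect_right


def interpolate_cm(locus_cr, framework):
    """Convert an RH-map position (centirays) to centimorgans.

    `framework` is a list of (centiray, centimorgan) pairs for the
    framework markers of one chromosome, sorted by centiray position,
    with at least two markers at distinct positions.
    """
    crs = [cr for cr, _ in framework]
    if not crs[0] <= locus_cr <= crs[-1]:
        raise ValueError("position lies outside the framework markers")
    # Index of the right-hand flanking marker, clamped to the last one.
    i = min(bisect_right(crs, locus_cr), len(crs) - 1)
    (cr0, cm0), (cr1, cm1) = framework[i - 1], framework[i]
    return cm0 + (locus_cr - cr0) / (cr1 - cr0) * (cm1 - cm0)
```

With hypothetical framework markers at (0 cR, 0 cM), (100 cR, 4 cM) and (250 cR, 7 cM), a locus at 50 cR would be placed at 2 cM. Because the ratio of centirays to centimorgans varies along a chromosome, positions obtained this way are approximate, as the text notes.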
the sample preparation required to genotype large numbers of SNPs, as
required to perform a genome scan. We developed a protocol based on
multiplex PCR in which primer pairs from many different loci are
combined in a single reaction (26). Although it is typically difficult
to combine many PCR assays, the approach worked well for our SNP
assays: 92% of the 558 loci passed the detection test when
amplification was performed in 24 sets of about 23 loci; 90% passed
when amplified in 12 sets of about 46 loci; 85% passed when amplified
in 6 sets of about 92 loci; and 50% passed when amplified in a single
set of 558 loci. The success appears to have resulted from a
combination of factors, including the small size of the amplification
targets, optimization of amplification conditions, and the presence of
the constant sequence at the 5'-ends of the primers (27). It may be
possible to salvage the unsuccessful assays by grouping them into
additional multiplex sets or by redesigning the assays.

Multiplex amplification of sets of 46 loci was used in subsequent
experiments because it decreased the number of reactions by a factor of
46 while allowing the vast majority (512/558) of loci to be assayed.
The procedure was further tested in 39 individuals and was quite
consistent: 96% of the 512 loci could be successfully read in 100% of
individuals tested and the remainder in nearly all individuals.

We next developed a genotyping algorithm for each SNP. Loci were
declared to pass a cluster test if the hybridization pat-



Table 2. Chromosomal distribution of genetic markers. Columns: Chr.,
chromosome; Markers, no. of framework markers used from the Genethon
genetic map; cM, genetic distance in cM on the Genethon genetic map;
SNPs, no. of SNPs; STSs, no. of STSs; Avg., average distance between
STSs (cM); Int., no. of intervals >10 cM. A "?" marks a value that is
not legible in this copy.

Chr.   Markers  cM    SNPs  STSs  Avg.  Int.
1      8?       ?     236   201   ?     ?
2      ?        277   177   ?     1.9   2
3      78       233   160   133   1.8   1
4      ?        213   96    67    2.?   4
5      ?        19?   66    72    ?.6   ?
6      71       201   ?     118   1.7   4
7      37       184   ?     94    2.0   1
8      45       166   135   108   1.5   3
9      40       167   106   88    1.9   3
10     53       102   85    76    2.3   1
11     56       156   105   92    1.7   1
12     43       ?     106   91    1.9   3
13     23       11?   57    45    2.6   ?
14     31       120   83    64    2.0   3
15     ?        110   76    67    1.6   0
16     ?        131   70    64    2.0   0
17     ?        129   94    57    1.5   2
18     29       17?   52    43    2.0   4
19     ?        110   58    53    2.1   7
20     34       97    56    49    2.0   2
21     18       60    29    26    ?     2
22     1?       56    31    26    2.1   1
X      35       ?     ?     43    4.4   6
Total  1036     3609  ?     1880  2.0   58




We next developed an efficient method for large-scale genotyping of
SNPs based on extending the use of DNA chips from SNP
Fig. 3. Genotyping chips. (A) Schematic diagram of genotyping array for
an SNP, consisting of two VDAs to study seven nucleotides centered
around the SNP. The top and bottom arrays are designed to be
complementary to allelic sequences containing A and C, respectively.
Probes perfectly matching the A and C alleles are shown in gray and
black, respectively. A genotyping array for the complementary strand
was also used but is not shown. (B) Hybridization signal for a
genotyping array probed with samples from three individuals with
respective genotypes AA, AC, and CC.



discovery to SNP genotyping (5). We synthesized genotyping chips
containing "genotyping arrays" for each SNP to be tested. Each
genotyping array consists of short VDAs corresponding to the two
alternative alleles (Fig. 3). The presence of an allele should be
reflected in strong hybridization to the corresponding resequencing
array. PCR assays were designed for the region containing each SNP
(25), with the goal of being robust and mutually compatible: the
amplification targets were all small (typically, a few nucleotides
around the polymorphic site), the primers all had similar calculated
melting temperatures, and constant sequences were added to the 5'-ends
of the forward and reverse primers to facilitate batch labeling of
pooled PCR products. Each assay was tested to ensure that it amplified
a single fragment from genomic DNA.

The most complex genotyping chip tested contained genotyping arrays for
558 candidate SNPs identified in the chip-based survey. Initially, the
558 loci were separately amplified, pooled, labeled, and hybridized to
the chip. To determine whether each locus could be reliably read, we
defined a formal detection test: loci passed if, for each of three
individuals tested, the expected DNA sequence could be successfully
read on both strands for one or both alleles. In all, 98% of the loci
passed this detection test (with the remaining 2% failing as a result
of weak hybridization or cross-hybridization).

We next sought to decrease substantially
terns seen in a test set of 39 individuals fell into distinct clusters,
corresponding to the possible genotypes (28). These clusters could then
be used to assign genotypes for further samples (29).
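
The idea of clustering a test set and then assigning later samples to the resulting clusters can be sketched as follows. This is a deliberately simplified stand-in, not the clustering procedure the authors used: it assumes each sample is summarized by a single allele fraction between 0 and 1, splits the sorted values wherever the gap exceeds a hypothetical threshold, and makes no call for a new sample that falls outside every cluster. All names and the `min_gap` value are assumptions.

```python
def allele_fraction(signal_a: float, signal_b: float) -> float:
    """Share of total hybridization signal attributed to allele A."""
    return signal_a / (signal_a + signal_b)


def split_clusters(values, min_gap=0.15):
    """Group sorted values, starting a new cluster at each large gap."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > min_gap:
            clusters.append([value])
        else:
            clusters[-1].append(value)
    return clusters


def assign_genotype(value, clusters, labels):
    """Return the label of the cluster whose range contains `value`.

    `labels` lists one genotype per cluster in ascending order of
    allele fraction. None means no call is made.
    """
    for cluster, label in zip(clusters, labels):
        if cluster[0] <= value <= cluster[-1]:
            return label
    return None
```

For a locus whose test-set fractions fall near 0, 0.5 and 1.0, `split_clusters` returns three groups that can be labeled BB, AB and AA, and `assign_genotype` then calls further samples against those ranges.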

The cluster test was applied to the approximately 500 candidate SNPs
that worked well under multiplex amplification conditions: 75% passed
the cluster test, and careful resequencing demonstrated that such loci
were all true polymorphisms. The cluster test thus provides reliable
confirmation of an SNP. The remaining 25% failed the cluster test, and
resequencing revealed that half were false positives in the SNP screen
and half were true polymorphisms (with the poor discrimination on the
chip typically due to one allele hybridizing more weakly than the
other). Overall, 88% of the candidate SNPs proved to be true
polymorphisms, and 86% of true SNPs passed the cluster test.
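
The two summary percentages follow from the pass rate and the even split among failing loci. A short check, assuming that three-quarters of the candidates pass (all of them true polymorphisms) and that half of the loci that fail are also true polymorphisms; the variable names are illustrative:

```python
pass_fraction = 0.75                 # candidates passing the cluster test
fail_fraction = 1.0 - pass_fraction  # candidates failing it
true_among_failed = 0.5              # share of failing loci that are real SNPs

# Fraction of all candidates that are true polymorphisms.
true_snps = pass_fraction + fail_fraction * true_among_failed

# Fraction of true polymorphisms that pass the cluster test.
true_passing = pass_fraction / true_snps
```

This gives 0.875 and about 0.857, which round to the 88% and 86% quoted in the text.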

To test the reproducibility and accuracy of the genotyping method, we
genotyped a set of 91 loci (passing the cluster test) in three
individuals by performing chip-based genotyping on separate occasions
over a 2-month period. The correct genotypes were independently
determined by thorough gel-based resequencing. The genotyping-chip
assay assigned a genotype in 98% of cases (1613/1638); this assignment
proved correct in 99.9% (1611/1613) of these cases. The loci were also
genotyped in two complete CEPH families. The genotypes were not
independently confirmed, but they were fully consistent with Mendelian
segregation.

For SNPs passing the cluster test, highly accurate genotypes could thus
be obtained with the simple design used here. For the remaining SNPs
(14%), similar accuracy can likely be obtained but may require
optimization of the genotyping array design, depending on the locus [as
shown in (5)].

The SNP surveys provide information about human genetic diversity. Two
classical measures of diversity (30) are $H$, the average
heterozygosity per nucleotide, and $K$, the proportion of sites
harboring a variation. $H$ does not depend on sample size, whereas $K$
increases with the number of genomes surveyed. For a population at
equilibrium, the neutral theory of evolution relates $H$ and $K$ to the
classical population genetic parameter $\theta = 4N_e\mu$, where $N_e$
is the effective population size and $\mu$ is the mutation rate per
nucleotide. ($\theta$ can be thought of as twice the number of new
mutations per generation arising in a population with size $N_e$.)
Specifically, $H = \theta$ and
$K = \theta\,[1^{-1} + 2^{-1} + 3^{-1} + \cdots + (n-1)^{-1}]$, where
$n$ is the number of genomes surveyed, provided that $\theta$ is small.
From these equations, one can estimate $\theta$ based on $H$ or $K$.
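
A minimal sketch of the two estimators follows. It takes the relations stated above at face value and assumes that three and seven diploid individuals correspond to n = 6 and n = 14 genomes; the function names are illustrative, and the inputs are the SNP rates and heterozygosities quoted earlier in the article.

```python
def theta_from_h(h: float) -> float:
    """Estimate theta from the heterozygosity per nucleotide."""
    return h


def theta_from_k(k: float, n: int) -> float:
    """Estimate theta from the proportion of variable sites.

    `n` is the number of genomes surveyed; the divisor is the sum
    1 + 1/2 + ... + 1/(n - 1).
    """
    return k / sum(1.0 / i for i in range(1, n))


# Gel-based survey: one SNP per 1001 bp in three individuals.
theta_gel_k = theta_from_k(1 / 1001, 6)
# Chip-based survey: one SNP per 721 bp in seven individuals.
theta_chip_k = theta_from_k(1 / 721, 14)
# Heterozygosity-based estimates for the same two surveys.
theta_gel_h = theta_from_h(3.96e-4)
theta_chip_h = theta_from_h(4.58e-4)
```

Under these assumptions all four values fall between about 4.0 and 4.6 per 10,000 nucleotides.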

The human population is not at equilibrium, but rather underwent a
rapid population expansion in the last 100,000 to 200,000 years. Such
population explosions tend to suppress the effects of genetic drift and
thus preserve the distribution of common alleles and the value of
$\theta$. Accordingly, the value of $\theta$ is relevant to the
ancestral human population before its recent expansion.

The four estimates of $\theta$ derived from $H$ and $K$ for the two
surveys are all roughly $\theta \approx 4 \times 10^{-4}$ (Table 1).
Assuming a mutation frequency of $\mu \approx 10^{-8}$ to $10^{-9}$,
this would suggest an effective population size of
$N_e \approx 10^{4}$ to $10^{5}$, which seems reasonable for the
ancestral population preceding the explosion in the last 100,000 years
(31). Strictly speaking, these estimates apply only to the European
population, from which all samples were drawn. However, a preliminary
survey of a more diverse sample of 31 individuals representing all
major racial groups yielded a value of $\theta$ that is only 30% larger
(26), consistent with the notion that human variation occurs primarily
within rather than between racial groups (32).
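
The step from the diversity estimate to an effective population size is a rearrangement of the definition of the population parameter as four times the effective population size times the mutation rate. A brief sketch, with an illustrative function name and the round values quoted in the text:

```python
def effective_population_size(theta: float, mu: float) -> float:
    """Solve theta = 4 * N_e * mu for N_e."""
    return theta / (4.0 * mu)


# Diversity of 4 per 10,000 nucleotides, with the two mutation rates
# bracketing the range assumed in the text.
n_e_low = effective_population_size(4e-4, 1e-8)
n_e_high = effective_population_size(4e-4, 1e-9)
```

These inputs give effective sizes of about ten thousand and one hundred thousand, matching the range stated in the text.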

The resources reported here represent only a first step toward a dense
SNP map of the human genome. The genetic map should already be useful
for family-based linkage studies, given the average spacing (2 cM) and
average heterozygosity (34%) of the markers. (The heterozygosity
applies to the European-derived samples studied here, but a preliminary
survey of about 180 of the SNPs shows that most are also polymorphic in
other groups.) It still remains to develop a suitable genotyping
system, such as a 2000-SNP genotyping chip.

Large-scale screening for human variation is clearly feasible. Someday
it may become possible to screen entire human genomes. In the nearer
term, a key goal will be to extend SNP discovery to the protein coding
regions of all human genes (roughly 120 Mb of sequence, only about 40
times more than the current study) in order to catalog the common
variants that may explain susceptibility to common genetic traits and
diseases (1).
site 5'-TGTAAAACGACGGCCAGT-3') at its 5'-end. The resulting PCR
products were subjected to dye-primer sequencing (33), with products
detected on an ABI fluorescence sequence detector. Candidate sequence
variations were detected by the ABI Sequence Navigator software
package, which suggests potential heterozygotes by identifying
nucleotide positions at which a secondary peak exceeds a certain
threshold (50%). Such apparent variations were then visually inspected
to compare the patterns seen among the several individuals.

12. D. N. Cooper and M. Krawczak, Hum. Genet. 85, 55 (1990).

13. M. Chee et al., Science 274, 610 (1996); M. J. Kozal et al.,
Nature Med. 2, 753 (1996).

14. S. P. A. Fodor et al., Science 251, 767 (1991); A. C. Pease et al.,
Proc. Natl. Acad. Sci. U.S.A. 91, 5022 (1994). The current generation
of technology allows
(199t). Ths Qfrmfi QOtieraiion of isrimoiogy anows 
tabficniion o< ^.?ft C'Ti t^f '..28 cm Mnr^i 0I 

4 "tej»n.ir^" ^ 20 by t ? |Mt* end comamiog 
> \0' coqicagit thft prone. 
15. J. G. Hacia et al., Nature Genet. 14, 441 (1996).
$TSs warn sfr^xtKM wftn uieir corrasAorvIt^ 
prtm^as OttScrtbQd<^. ncR products intanded (ty 
hybrtAattoo iotr« BVTW c^ip ftypicajV IOO to 200 
STSs torn 9 tmn^ indMcM wara pootoo toosthor 
tor sobMquwtt bmcess>AQi. About 1 to 2 119 Of tn» 
poMO PGR product was piMd ivth Qtaqutck Du- 
riflti tf oft Kit (GlflQBn^ frs^'i^aritod virfth dMoxyiiborti^ 

«it?t tarminai ddOBQ^tucwDtfi^ tiansdyaso (Tdr, 
gb coBRL Ute Tecfy *e)oow t»» wrtteaaon wa* 
pOfHorreao aeusiftiji X) mo mnu!Mhffvr*8 ^viruc* 
aons. Tfta t agma i ii wiu n was pafftpned m a 40*(U 
i«8citart«ftnO.2un3af0N990l. 10 mM trts-aceuus 
(pH 75). 10 mM magnnlun Acetata. ar^d SO mM 
p ot » M tow> pcQtaio at 37*c tor 15 after wNcn 
ino rvction was stoppad Dy haar InactltQttort al 
98^ lor 19 rmi Tna tamM tmtoraae roactfon 
WW perfonnao ey Bddfr^ 1 » urns 0) TdT Pino 12-5 
iiM ttottvMgooATP (OuPvit hBi^ le the pracstf- 
lr^ora» c tttfMi< m pa.lxubanngfta!3y*C<ori nouf. 
■s^ «^ ftaat"ir\aG»M»Ngtm9e^ 
labiiad Mutpics mara hytyttfead to im efi^P as 
IM. Somptoa were donaiunsd at -^*C ior 5 to 0 
rn)n and o«M on toft for 2 10 5 n<dn . Obipr. vMf«8i«t 
r^Mtet vutth ex SSPET [0.9 M Naa 00 nM 
Ntf^jPO. 8 EDfTA (pH 7.41l a009% Tr^on 

X- 1 cor *S fnlri aner fiian Nybrtdbced wtth 0a de- 
ftaturao sa»npto h nyoneizatian buf^ pM Mrs- 
nwthytvnmortiunicMorida. iOrrMv(a-HCI(pH7.fl|. 
1 fvM EOTA 0.01H imon X-100; oemnQ sporm 
DMA nOO MSArQ, ano 200 pM cofttrol oeoomar] « 
44*C lor 15 houii eri a fons^aita at 4Or0m. Criips 
v«rewuhodinraatinrMw(th ixSSP6t. lOtmnas 
€x SSp€T at 22*C, «nd •ojiyfO « foom ttm-

eofl^ym (2 hom) (Motacutar Pieaoa? ra oeflty- 
oiod bouif» sonrt afetdviin m^M^ m fix 
SSPEn to 8 rrtn. Alter tw *em sjwwd. ine CN» 

chio w«$ (lotockxi oy 3 cottocei ci^ scanner 
(HP/AtfyrvtO^ « reoeUlon of 40 to 80 pfareU 
par iMiur* and a 5G0-nm Am. 
Cantft^a S^^3 waro kdOfttif>otf by ustno o combl* 

ftofv. As osth oosHo^ lha VDA contin otv) 'OX' 
p a c taif pniba {Beyra apcin dtfig to tfw aa t^ uan ca ftw v 
wn^ tfw cn^ «Mes Mciffru^ M ttim vancrtr 

sWom: «t u#ilch/ln 90ma tndMtfuat^ 3 pfObc 
9lii>a a gtitf iiy i^|r^tf^Bniho mppctao p*obc.T^ 

vacity iL frow tha oigM pR]htf a! postrtaft f jtour bus 
9uo^Mor« on botn straps) ki indMducf > end 
tookatf to pn»;tbnf /«t «vhMt ^ lectors a, taB MO 
miAbM dustsra, Tho itVid i^goi^t^ (mutgnf fric* 
wos ^Irntter Dui iDcusatf only on ttia aMpactto 
pmte tnd a vartam piooe M ft umo 
tfwi aR thraa variant protw^. Thu lourth •kjorf'tTn 
OlDolprint datactioni lookad tor tNi lo^n of 9b)n^ 
occuii at tnaexpacted proba» tnihe naigrmcrhoofl 
of on SNP 13). Tho elporTthm» heva dittmm 

(yjueWuttont. 
18. As discussed in the text, the expectation of $K$ is
$\theta\,[1^{-1} + 2^{-1} + 3^{-1} + \cdots + (n-1)^{-1}]$, where $n$
is the number of genomes surveyed. The expectation of $K$ increases by
39.3% when the number of genomes is increased from 6 (in the gel-based
survey) to 14 (in the chip-based survey). This agrees well with the
observed
m.rri of ih« common tOfidtidi. Tno s^smpiEi ^ucr uf 
14 rvK 3 .so% cncncc or i'(ciocnn$ an «s«^ iwih a 
fraqtency d 5*». Do«*)*nn tn« prwt^i 0» varettl 

tfV^ tortrtub (br < Tha targar sarripte Riif! vvil Tfinrt ro 
ZflL wme fesCtmiCrtCOCi Or> bOin £tr«riUbi with tt^ 

thai lor>^ STSft can ba anaty^rirt. whi>t#4%^()CHXtf-^ 
v>!Q\«:ncvia it Mei) to abooi GOO t^. n tn ititj^ 
SO^^^Ot) to uSd Ca<Mr POT prtxXuas fo arttfy/r .n 
rf!Qtnn. Th« o.r^nt sOiTy M not taka atfvomaga of 
thH^ fMmre Dec3iiK sim STSs ittaady 

sv^ihhtn from oi.if ptcv*MO {9. 7). 
22. Confirmation was initially performed by multipass resequencing and
is currently performed by using the clustering test on genotyping
chips.

23. M. A. Walter et al., Nature Genet. 7, 22 (1994).

24. The lowest density occurs on chromosome X, which has the lowest
density of STSs and which was screened in fewer total genomes inasmuch
as the screening panel included three males.

25. For each SNP, PCR primers were chosen with the PRIMER software
package (34) to closely flank the polymorphic base and to have a
predicted melting temperature of 57°C. Forward and reverse primers
were synthesized with the T7 and T3 promoter sequences
(5'-TAATACGACTCACTATAGGGAGA-3' and 5'-AATTAACCCTCACTAAAGGGAGA-3') at
their respective 5' ends. Each PCR primer pair was individually tested
to determine if it produced a single clear fragment visible by agarose
gel electrophoresis and ethidium-bromide staining, as described (6).
PCR assays passing this test were further classified as being strong
or weak according to the yield of the fragment produced. Primer pairs
were grouped into
i«*(p*w scte. laah the stti a^m to consist 0* 

•«>«r ;»\MiQ BuayKOT wuuh asWT-. 
if Mktftipex PCfl was perCmnwvJ Oy oting mi tt\0^, 
pfHiar pai»^ jn o sii{)»; raacdon. Spo.:ak;aV mi tti- 
pl«K PCA necc(rMt!> vfore prftOfniaO In .i vol- 



(jpiaoontl^*^ 1C»n0o»hum(mganom!eOMA,O.1 
to 0.2 »iM or oach orim. t liCt ef An^Taq Gotd 
(PaMrvBtw), 1 mM dBonynudiGCIdBtflphoBPhiftaa 

MgCtj andOO0l%gBtalK'n »wumL y ij a j iyweBpar- 

!om«6 or » TewiO Haswifch^ wen w*tt <^ 
turuioA fit 98*C tor 1 0 nwf (WtMfoa oy 30 cycias or 

dcvytrirailnn si tor 30 primsv mmvJtng at 

55^ tor 2 fnen. and prtner «t^Tenstoo at iffi'C tor 2 
mbi. Altar 30 Cydoa. 6 fma* flxtonsion rstctun wro 
CSfWoU at flA*C lor 5 r>*v flacavaa ra^)rV) 
POI produett amas. tt tdinaeasstry to 
(fagrnom ihom [«s ^ dons ror 018 STSs m Tho SNP 

jL i H ar). Tha PCn pr o oucia «nr* ihan !abeM «^ 
tMin ft« a atanoanj PCR raoctton. by uatng and 
T3 pitmara mSh tkm tiMa at tmlr 5*<«nda. fhn 
mocten wM ptrcomiod vrIA 1 iJ of lampteta Of^lA. 
0.1 toO.3 ^ ObM primer. 1 una of AmptfToq GoM 
^^MAfVSmfif). lOOtkMOfTF^ ^OmMtri>HOtOH 

tin. TharmocycfinQ wast pailufi'i^ad *Mitt tnMsS dons* 
tunfloncBQe*C«os lO^todonwaotiy »cyCtaaat 
donofiuraiian at 99*C lor 30 s. pnmor annasmg at 
SSXfDT f mtt anopfWgf wtergygt 72*CfOr 1 
min. AAar 25 cjclaa. a final eidaiialar^ racscikin 
earnad out at 77*C for 5 min. Tha PGR proifcicta 
rromme uain*fi<nu9tpi9x raacttone for ^ incnadtAt 
««er9 lAen oocteo lo^ncr. Ono-rantn of the pooied 
i9mD(« wyz denatured ano used tor dvp iivbrtotz»> 
tmn. Chte* wipr* n^ortrwrt. vxi^^neo. stanod ana 

«¥jrw«J, a? ahn» (TfiV 
27. D. G. Wang, unpublished observations.

reww wco locu? on nw owis of tha ny- 

bf ktotan fMutTR QbKMVAd In a twr* popUat'on of 3? 
tnc8v)duiit». The proportM of the two porter, 
proioca in inc i' tt\ -.wptv*;!. Owtad -t^, and tt^^ 
(*Wt r^, * %tt « II *erc C«iflWtad ossantrafly 1:^ 
comtvt^ the rjh^nrvrrr hvhridUKrtiOrt dpnaf (o if» 
expeciart agnah lor rtw twrt VDa^. The wiluca V;^^ 
tar ma 35 tndivickiati tt? in rhp inr^fViii )0. i) 9iKi 
shotid «deet)y dusfor rwar 0, 0.ff. .vid ) 0. i3t/l otncr 
^ttGrf IS itiiyht uccu tsecauaa 01 cSffarviOFf- In Hy* 
ijfidS<^Ui>n inii:iv;ily itetweun tha two aeaia.n. Thft vnf- 
lk<n wcro On^noay CKClOf^ 03\ wHh rhr MQO- 



The cellular properties of neurons are modulated by a number of
extrinsic signals, including synaptic activity, neurotrophic factors,
and hormones. These signaling systems alter the intracellular
concentrations of second messengers such as calcium and cyclic
nucleotides, and these small mole-


ECLUS procedure of the SAS software package (SAS Institute). A maximum
of three nonoverlapping clusters was permitted, defined by points with
a minimum separation of 0.12. A locus failed the cluster test if all
the samples fell in a single cluster, if the samples gave rise to two
clusters but neither corresponded to the heterozygous genotype (AB), or
if too many samples (more than 9 of 39) fell outside the three optimal
clusters. A locus passing the cluster test gave rise to either three
clusters (genotypes AA, AB, BB) or two clusters (genotypes AA, AB or
BB, AB).

29. Subsequent samples were genotyped according to these predefined
clusters.
cules can regulate the activities of protein kinases (1). As the
mechanisms linking Ras
RasGRP, a Ras Guanyl Nucleotide-
Releasing Protein with Calcium- and
Diacylglycerol-Binding Motifs

Julius O. Ebinu, Drell A. Bottorff, Edmond Y. W. Chan,
Stacey L. Stang, Robert J. Dunn, James C. Stone*

RasGRP, a guanyl nucleotide-releasing protein for the small guanosine
triphosphatase Ras, was characterized. Besides the catalytic domain,
RasGRP has an atypical pair of "EF hands" that bind calcium and a
diacylglycerol (DAG)-binding domain. RasGRP activated Ras and caused
transformation in fibroblasts. A DAG analog caused sustained activation
of Ras-Erk signaling and changes in cell morphology. Signaling was
associated with partitioning of RasGRP protein into the membrane
fraction. Sustained ligand-induced signaling and membrane partitioning
were absent when the DAG-binding domain was deleted. RasGRP is
expressed in the nervous system, where it may couple changes in DAG and
possibly calcium concentrations to Ras activation.



1082 SCIENCE • VOL. 280 • 15 MAY 1998 • www.sciencemag.org



